Arch W. Naylor 
George R. Sell 


Applied Mathematical Sciences

Linear Operator Theory
in Engineering
and Science


Springer-Verlag




Applied Mathematical Sciences 
Volume 40 


Editors 
F. John    J. E. Marsden    L. Sirovich


Advisors 
H. Cabannes M. Ghil J. K. Hale 
J. Keller J. P. LaSalle G. B. Whitham 


Applied Mathematical Sciences 


 1. John: Partial Differential Equations, 4th ed. (cloth)
 2. Sirovich: Techniques of Asymptotic Analysis.
 3. Hale: Theory of Functional Differential Equations, 2nd ed. (cloth)
 4. Percus: Combinatorial Methods.
 5. von Mises/Friedrichs: Fluid Dynamics.
 6. Freiberger/Grenander: A Short Course in Computational Probability and Statistics.
 7. Pipkin: Lectures on Viscoelasticity Theory.
 8. Giacaglia: Perturbation Methods in Non-Linear Systems.
 9. Friedrichs: Spectral Theory of Operators in Hilbert Space.
10. Stroud: Numerical Quadrature and Solution of Ordinary Differential Equations.
11. Wolovich: Linear Multivariable Systems.
12. Berkovitz: Optimal Control Theory.
13. Bluman/Cole: Similarity Methods for Differential Equations.
14. Yoshizawa: Stability Theory and the Existence of Periodic Solutions and Almost
    Periodic Solutions.
15. Braun: Differential Equations and Their Applications, 2nd ed. (cloth)
16. Lefschetz: Applications of Algebraic Topology.
17. Collatz/Wetterling: Optimization Problems.
18. Grenander: Pattern Synthesis: Lectures in Pattern Theory, Vol. I.
19. Marsden/McCracken: The Hopf Bifurcation and Its Applications.
20. Driver: Ordinary and Delay Differential Equations.
21. Courant/Friedrichs: Supersonic Flow and Shock Waves. (cloth)
22. Rouche/Habets/Laloy: Stability Theory by Liapunov’s Direct Method.
23. Lamperti: Stochastic Processes: A Survey of the Mathematical Theory.
24. Grenander: Pattern Analysis: Lectures in Pattern Theory, Vol. II.
25. Davies: Integral Transforms and Their Applications.
26. Kushner/Clark: Stochastic Approximation Methods for Constrained and
    Unconstrained Systems.
27. de Boor: A Practical Guide to Splines.
28. Keilson: Markov Chain Models: Rarity and Exponentiality.
29. de Veubeke: A Course in Elasticity.
30. Sniatycki: Geometric Quantization and Quantum Mechanics.
31. Reid: Sturmian Theory for Ordinary Differential Equations.
32. Meis/Marcowitz: Numerical Solution of Partial Differential Equations.
33. Grenander: Regular Structures: Lectures in Pattern Theory, Vol. III.
34. Kevorkian/Cole: Perturbation Methods in Applied Mathematics. (cloth)
35. Carr: Applications of Centre Manifold Theory.
36. Bengtsson/Ghil/Kallén: Dynamic Meteorology: Data Assimilation Methods.
37. Saperstone: Semidynamical Systems in Infinite Dimensional Spaces.
38. Lichtenberg/Lieberman: Regular and Stochastic Motion. (cloth)
39. Piccinini/Stampacchia/Vidossich: Ordinary Differential Equations in R^n.
40. Naylor/Sell: Linear Operator Theory in Engineering and Science. (cloth)


Arch W. Naylor 
George R. Sell 


Linear Operator Theory 


in Engineering 
and Science 


With 120 Figures 


Springer-Verlag 
New York Berlin Heidelberg London 
Paris Tokyo Hong Kong Barcelona 


To 
Andrée 
and 
Geraldine 


Preface

The goal of this book is to present the
basic facts of functional analysis in a form
suitable for engineers, scientists, and applied
mathematicians. Although the Definition-Theorem-Proof
format of mathematics is
used, careful attention is given to motivation 
of the material covered and many illustrative 
examples are presented. 

The text can be used by students with 
various levels of preparation. However, the 
typical student is probably a first-year grad- 
uate student in engineering, one of the for- 
mal sciences, or mathematics. It is also pos- 
sible to use this book as a text for a 
senior-level course. In order to facilitate 
students with varying backgrounds, a num- 
ber of appendices covering useful mathe- 
matical topics have been included. Moreover, 
there has also been an attempt to make the 
pace in the beginning more gradual than 
that of later chapters. 

The first five chapters are concerned with 
the “geometry” of normed linear spaces. The 
basic approach is to “disassemble” this geo- 
metric structure first, study the pieces, then 
reassemble and study the whole geometry. 
The pieces that result from this disassembly 
are set-theoretic, topological, and algebraic 
structures. Hence, Chapter 2 covers the ap- 
propriate set theory; Chapter 3 treats topo- 
logical structure, in particular, metric spaces; 
and Chapter 4 handles algebraic structure, 
in particular, linear spaces. The reassembly 
takes place in Chapter 5 where normed 
linear spaces are studied. The main topic of 
this chapter is the geometry of Hilbert 
spaces. 

The authors have found that the ma- 
terial covered in these first five chapters can 
be presented in a one-semester beginning 
graduate course. Indeed, the authors have 
done so a number of times in engineering
and mathematics departments at a number of universities in the United States,
Europe, and South America. Needless to say, the mode of presentation depends 
upon the audience. For certain audiences, motivation and examples are empha- 
sized while proofs are only highlighted. For others, the converse is the case. An 
attempt has been made to make the book suitable for both modes of presenta- 
tion. Moreover, there is material in the large collection of exercises appropriate 
for each type of audience. 

Chapters 6 and 7 take the geometric structure developed in the first five 
chapters and apply it to the geometric analysis of linear operators. Chapter 6 
covers the Spectral Theorem (the eigenvalue-eigenvector representation) for 
compact operators. Chapter 7 extends this material to certain discontinuous 
operators, in particular it treats those operators with compact resolvents. These 
two chapters also contain many illustrative examples. 

Many chapters are divided into parts (Part A, Part B, and so forth). Part A 
contains basic introductory concepts. The subsequent parts of each chapter 
develop additional concepts and special topics. Thus, if a relatively quick intro- 
duction is desired, Part A can be covered first and material from the rest of the 
chapter can be added as needed. 

For the person who is interested in getting to the spectral theory of linear 
operators as soon as possible it is recommended that he cover Part A of Chapters 
3 and 4, Sections 1-8, 12-24 of Chapter 5, and then Chapters 6 and 7. 

There is an important problem concerning integration theory. Although 
integration theory is not needed to understand the basic material covered, there 
are certain examples that do make reference to the Lebesgue integral and prob- 
ability spaces. This problem can be handled in at least two ways. First, it can 
be more or less ignored. That is, the student can be told that there is such a 
thing as a Lebesgue integral and what its relation to the, presumably familiar, 
Riemann integral is. Probability spaces can be “glossed” over in the same way. 
The other way to approach the problem is to use the appendices. Appendix D 
gives an introduction to Lebesgue integration theory, and Appendix E presents 
the basic facts about probability spaces. 

Each chapter is denoted by a numeral; that is, Chapter 3. The tenth section 
of the third chapter is denoted Section 3.10. However within Chapter 3, the 3 
may be dropped and Section 10 used instead of Section 3.10. Theorem 5.5.4 
(or Definition 5.5.4, Lemma 5.5.4, Corollary 5.5.4) refers to the fourth theorem 
in Section 5 of Chapter 5. 

The symbol ■ is used to denote the end of proofs and examples. This
allows the proofs or examples to be skimmed on first reading.

The authors would like to thank a number of people who have aided in the 
development of this book. First, there are the students at various universities 
who have taken courses from one or the other of us based upon manuscript 
versions. Their suggestions have been invaluable. Next, we would like to thank 
colleagues who have aided us in various ways: H. Antosiewicz, M. Damborg, 
K. Trani, G. Kallianpur, W. Kaplan, W. Littman, W. Miller, R. Perret, W. Porter, 
T. Pitcher, P. Rejto, Y. Sibuya, H. van Nauta Lemke, and H. Weinberger. We
especially want to thank F. Beutler for the many suggestions that arose out of
his classroom use of the manuscript. Finally, we would like to thank the many 
secretaries at various universities who have helped in the preparation of the manu- 
script. In particular, we would like to thank the secretarial staffs of the Depart- 
ment of Electrical Engineering at the University of Michigan and the School 
of Mathematics at the University of Minnesota. 


Ann Arbor Arch W. Naylor 
Minneapolis George R. Sell 
1971


Preface to the Second Edition 


We are very pleased that the new edition is being published and we are 
grateful to Springer-Verlag for doing this. The number of inquiries that we 
received each year made us believe that a new edition would be welcomed. We 
hope we were right, and we hope that it will be of use to our colleagues and their 
students. 

We further hope, probably unrealistically, that we have corrected all errors of 
the first edition. 


Ann Arbor Arch W. Naylor 
Minneapolis George R. Sell 
1982 


Contents


Chapter 1 Introduction

     1. Black Boxes
     2. Structure of the Plane
     3. Mathematical Modeling
     4. The Axiomatic Method. The Process of Abstraction
     5. Proofs of Theorems


Chapter 2 Set-Theoretic Structure

     1. Introduction
     2. Basic Set Operations
     3. Cartesian Products
     4. Sets of Numbers
     5. Equivalence Relations and Partitions
     6. Functions
     7. Inverses
     8. Systems Types


Chapter 3 Topological Structure

     1. Introduction

  Part A Introduction to Metric Spaces

     2. Metric Spaces: Definition
     3. Examples of Metric Spaces
     4. Subspaces and Product Spaces
     5. Continuous Functions
     6. Convergent Sequences
     7. A Connection Between Continuity and Convergence

  Part B Some Deeper Metric Space Concepts

     8. Local Neighborhoods
     9. Open Sets
    10. More on Open Sets
    11. Examples of Homeomorphic Metric Spaces
    12. Closed Sets and the Closure Operation
    13. Completeness
    14. Completion of Metric Spaces
    15. Contraction Mapping
    16. Total Boundedness and Approximations
    17. Compactness


Chapter 4 Algebraic Structure

     1. Introduction

  Part A Introduction to Linear Spaces

     2. Linear Spaces and Linear Subspaces
     3. Linear Transformations
     4. Inverse Transformations
     5. Isomorphisms
     6. Linear Independence and Dependence
     7. Hamel Bases and Dimension
     8. The Use of Matrices to Represent Linear Transformations
     9. Equivalent Linear Transformations

  Part B Further Topics

    10. Direct Sums and Sums
    11. Projections
    12. Linear Functionals and the Algebraic Conjugate of a Linear Space
    13. Transpose of a Linear Transformation


Chapter 5 Combined Topological and Algebraic Structure

     1. Introduction

  Part A Banach Spaces

     2. Definitions
     3. Examples of Normed Linear Spaces
     4. Sequences and Series
     5. Linear Subspaces
     6. Continuous Linear Transformations
     7. Inverses and Continuous Inverses
     8. Operator Topologies
     9. Equivalence of Normed Linear Spaces
    10. Finite-Dimensional Spaces
    11. Normed Conjugate Space and Conjugate Operator

  Part B Hilbert Spaces

    12. Inner Product and Hilbert Spaces
    13. Examples
    14. Orthogonality
    15. Orthogonal Complements and the Projection Theorem
    16. Orthogonal Projections
    17. Orthogonal Sets and Bases: Generalized Fourier Series
    18. Examples of Orthonormal Bases
    19. Unitary Operators and Equivalent Inner Product Spaces
    20. Sums and Direct Sums of Hilbert Spaces
    21. Continuous Linear Functionals

  Part C Special Operators

    22. The Adjoint Operator
    23. Normal and Self-Adjoint Operators
    24. Compact Operators
    25. Foundations of Quantum Mechanics


Chapter 6 Analysis of Linear Operators (Compact Case)

     1. Introduction

  Part A An Illustrative Example

     2. Geometric Analysis of Operators
     3. Geometric Analysis. The Eigenvalue-Eigenvector Problem
     4. A Finite-Dimensional Problem

  Part B The Spectrum

     5. The Spectrum of Linear Transformations
     6. Examples of Spectra
     7. Properties of the Spectrum

  Part C Spectral Analysis

     8. Resolutions of the Identity
     9. Weighted Sums of Projections
    10. Spectral Properties of Compact, Normal, and Self-Adjoint Operators
    11. The Spectral Theorem
    12. Functions of Operators (Operational Calculus)
    13. Applications of the Spectral Theorem
    14. Nonnormal Operators


Chapter 7 Analysis of Unbounded Operators

     1. Introduction
     2. Green’s Functions
     3. Symmetric Operators
     4. Examples of Symmetric Operators
     5. Sturm-Liouville Operators
     6. Garding’s Inequality
     7. Elliptic Partial Differential Operators
     8. The Dirichlet Problem
     9. The Heat Equation and Wave Equation
    10. Self-Adjoint Operators
    11. The Cayley Transform
    12. Quantum Mechanics, Revisited
    13. Heisenberg Uncertainty Principle
    14. The Harmonic Oscillator


Appendix A The Hölder, Schwartz, and Minkowski Inequalities

Appendix B Cardinality

Appendix C Zorn’s Lemma

Appendix D Integration and Measure Theory

     1. Introduction
     2. The Riemann Integral
     3. A Problem with the Riemann Integral
     4. The Space C_0
     5. Null Sets
     6. Convergence Almost Everywhere
     7. The Lebesgue Integral
     8. Limit Theorems
     9. Miscellany
    10. Other Definitions of the Integral
    11. The Lebesgue Spaces, L_p
    12. Dense Subspaces of L_p, 1 ≤ p < ∞
    13. Differentiation
    14. The Radon-Nikodym Theorem
    15. Fubini Theorem

Appendix E Probability Spaces and Stochastic Processes

     1. Probability Spaces
     2. Random Variables and Probability Distributions
     3. Expectation
     4. Stochastic Independence
     5. Conditional Expectation Operator
     6. Stochastic Processes

Index of Symbols

Index

Introduction 


Black Boxes 
Structure of the Plane 
Mathematical Modeling 


The Axiomatic Method. 
The Process of Abstraction 


Proofs of Theorems 


1. BLACK BOXES 


A great number of the mathematical problems of engineering and science
can be fruitfully viewed as what are often referred to as “black box problems.”
One puts an “input” into a black box (Figure 1.1.1), the black box hums and
whirls inside, and out comes an “output.” Black box problems are questions about
what black boxes do.

[Figure 1.1.1. A black box: Input → Black Box → Output.]

The following are a few examples:


(1) If a black box is in fact an amplifier, questions can be asked about band- 
width, unit step response, distortion, and so on. 


(2) Given an autonomous differential equation $\dot{x} = f(x)$, the initial state (that is,
the initial conditions) may be viewed as the input and the resulting motion (or 
solution) may be viewed as the output. Many questions can be asked about the 
behavior of such equations; for example, questions about asymptotic growth, 
stability, periodicity, and so on. 


(3) The input data to a digital computer is a string of symbols and its corres- 
ponding output is another string of symbols. The program determines what this 
black box does. 


(4) Let $S = \{s_1, s_2, \ldots, s_n\}$ denote the state set for a Markov chain and
$p(k)$, $k = 0, 1, 2, \ldots$, denote the probability distribution over S at time k. Further
let A denote the matrix of transition probabilities; that is,

$p(k + 1) = A p(k), \quad k = 0, 1, 2, \ldots.$

One can view the initial probability distribution p(0) as an input and the resulting
sequence p(1), p(2), ... as the output. Or one can view p(k) as an input and the
resulting p(k + 1) as an output.


(5) In the case of a plucked string, the initial stretched position of the string 
before release can be viewed as an input and the resulting string vibration can be 
viewed as an output. 




(6) In a quantum-mechanical system, the wave function $\psi(x, t)$ may be
viewed as an input and the integral $\int |\psi(x, t)|^2 \, dx$ or the partial derivative $\partial\psi/\partial t$
may be viewed as the output.


Needless to say, there is no end to the problems that can be formulated as black 
box problems. 
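
To make the black box of example (4) concrete, the following is a minimal sketch in Python. It assumes that p(k) is treated as a column vector, so that each column of A sums to one; the two-state matrix and the function names `step` and `run` are hypothetical choices made only for illustration.

```python
def step(A, p):
    """One application of the black box: return A p for a distribution p."""
    n = len(p)
    return [sum(A[i][j] * p[j] for j in range(n)) for i in range(n)]


def run(A, p0, steps):
    """Feed the input p(0) to the black box; return the output p(1), ..., p(steps)."""
    outputs = []
    p = p0
    for _ in range(steps):
        p = step(A, p)
        outputs.append(p)
    return outputs


# Hypothetical two-state chain (an assumption, not taken from the text).
# Column j holds the probabilities of moving out of state j, so columns sum to 1.
A = [[0.9, 0.5],
     [0.1, 0.5]]
p0 = [1.0, 0.0]          # input: start in the first state with certainty
outputs = run(A, p0, 3)  # output: the sequence p(1), p(2), p(3)
```

Both viewpoints mentioned in example (4) appear here: `step` maps p(k) to p(k + 1), while `run` maps the initial distribution to the resulting sequence.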

As far as this book is concerned, the most important aspects of black box 
problems are that, once surface detail is removed, seemingly different problems 
become similar to one another and that certain patterns repeatedly appear in 
solution methods. For example, one does not treat linear time-invariant network
problems as separate unrelated problems; rather, one approaches them as a unified
class of closely related problems. Similarly, it was noticed long ago that at a certain
level of abstraction the matrix equation 


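$$
\begin{bmatrix} y_1 \\ y_2 \\ \vdots \\ y_n \end{bmatrix}
=
\begin{bmatrix}
a_{11} & a_{12} & \cdots & a_{1n} \\
\vdots &        &        & \vdots \\
a_{n1} & a_{n2} & \cdots & a_{nn}
\end{bmatrix}
\begin{bmatrix} x_1 \\ x_2 \\ \vdots \\ x_n \end{bmatrix}
$$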


and the integral equation 


$$
y(t) = \int_0^T k(t, \tau)\, x(\tau)\, d\tau, \qquad t \in [0, T]
$$


describe similar mathematical situations. Another way to say this is that these 
problems have similar mathematical structures. 
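
One way to see the similarity is to discretize the integral. The sketch below is an illustration under stated assumptions, not a method taken from the text: it uses the midpoint rule on n equal subintervals, and the kernel k(t, τ) = tτ with input x(τ) = 1 on [0, 1] is a hypothetical choice for which the exact output is y(t) = t/2. The names `kernel_matrix` and `mat_vec` are likewise ours.

```python
def kernel_matrix(k, T, n):
    """Midpoint-rule discretization of the integral operator with kernel k on [0, T].

    Returns the n-by-n matrix with entries a_ij = k(t_i, tau_j) * h and the nodes.
    """
    h = T / n
    nodes = [(j + 0.5) * h for j in range(n)]
    A = [[k(t, tau) * h for tau in nodes] for t in nodes]
    return A, nodes


def mat_vec(A, x):
    """Ordinary matrix-vector product y = A x."""
    return [sum(a * xj for a, xj in zip(row, x)) for row in A]


# Hypothetical example: k(t, tau) = t * tau and x(tau) = 1 on [0, 1].
A, nodes = kernel_matrix(lambda t, tau: t * tau, 1.0, 8)
x = [1.0 for _ in nodes]
y = mat_vec(A, x)  # approximates y(t) = t / 2 at the nodes
```

After discretization the integral equation is literally a matrix equation y = Ax, with the kernel values playing the role of the matrix entries.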

The black box is thus an “‘ operator” which transforms an input into an output. 
It is these operators that form the subject matter of our book. What we want to 
do, then, is recognize and study the essential mathematical structure of these 
operators. Although there are many kinds of operators, our goal here is to study 
those that can, once unessential details are removed, be viewed as transformations 
from normed linear spaces into normed linear spaces. This allows us to treat, in a
unified manner, matrix equations, integral equations, differential equations,
difference equations, and random processes.

The real Euclidean plane is an example of a normed linear space. N-dimensional
Euclidean space is another. Certain sequence spaces and function spaces are also
examples. There are many other examples as we shall see later. For us the most
important fact about normed linear spaces is that they all have a geometric structure
that is very similar to ordinary two- or three-dimensional Euclidean geometry.
This is particularly true for Hilbert spaces, a special subclass of normed linear 
spaces. This geometric structure is the unifying theme of the material presented in 
this book. 

The first part (Chapters 2, 3, 4, and 5) of this book is devoted to a detailed 
study of this geometric structure. It turns out that the geometric structure of a 
normed linear space really involves three different kinds of structure: set-theoretic, 
topological, and algebraic. We illustrate this subdivision in the next section with 
the aid of a familiar example: the plane. 




2. STRUCTURE OF THE PLANE 


The real Euclidean plane is a classic example of a normed linear space. As 
we have noted, it has set-theoretic, topological, and algebraic structure. 


Set-Theoretic Structure 


Before anything else, the plane is a set. In particular, it is the set of all ordered
pairs of real numbers $x = (x_1, x_2)$. Denote this set by $R^2$, and note that (7,1) and
(1,7) are different points in this set. We refer to this set as the underlying set.


Topological Structure 
The type of topological structure that we are interested in here has to do 
with the concept of closeness. In particular, the Euclidean distance d between any
two points $x = (x_1, x_2)$ and $y = (y_1, y_2)$ is

$d(x, y) = \{|x_1 - y_1|^2 + |x_2 - y_2|^2\}^{1/2}.$

The set $R^2$ equipped with this distance function is an example of what is called a
metric space. 


Algebraic Structure 


The type of algebraic structure that we are interested in here is addition and
scalar multiplication of points (vectors) in the plane. Thus, if $x = (x_1, x_2)$ and
$y = (y_1, y_2)$, then

$x + y = (x_1 + y_1, x_2 + y_2).$

And if $\alpha$ is any real number,

$\alpha x = (\alpha x_1, \alpha x_2).$

With this structure on the set $R^2$ we have a linear space.


Combined Topological and Algebraic Structure 


It is possible to have metric spaces that are not linear spaces and vice versa. 
As we have just seen here, it is also possible to have both a topological and an 
algebraic structure on the same underlying set. It happens very often that the 
topological and algebraic structure are blended together. In the case of the plane, 
and normed linear spaces in general, this blending is accomplished by means of 
the norm or length of vectors in the plane. If $x = (x_1, x_2)$, then the norm of x is
given by

$\|x\| = (x_1^2 + x_2^2)^{1/2}.$    (1.2.1)

It follows that

$d(x, y) = \|x - y\|$

and

$\|\alpha x\| = |\alpha| \, \|x\|,$

where $\alpha$ is any scalar. Neither of the above two expressions would make sense if
we did not have algebraic structure. We will see later that addition and scalar 
multiplication have continuity properties which are also a result of the blending 
of topological and algebraic structure. 


Geometric Structure 


When we put all the pieces together we are back to the plane with its familiar 
geometry. Some geometric facts are the result of the presence of topological 
structure only, some the result of the presence of algebraic structure only, and some 
involve both. We shall see which are which in the following four chapters. 

Before we go on, it should be noted that the norm in (1.2.1) has some additional 
structure, namely that it is generated by an inner product. The inner product 
between the two vectors x and y in $R^2$ is given by

$(x, y) = x_1 y_1 + x_2 y_2.$

Thus $\|x\| = (x, x)^{1/2}$. It should be noted here that there are other norms one can
prescribe on $R^2$ that are not generated by inner products. We shall see in Chapter
5 that the geometric structure of spaces with inner products is much richer than 
those without. 
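
The structures of this section can be collected in a short Python sketch. Two things here are not taken from the text above and should be read as assumptions: the maximum norm max(|x_1|, |x_2|) is our choice of a norm on the plane that is not generated by an inner product, and the test used to detect this is the parallelogram law, $\|x + y\|^2 + \|x - y\|^2 = 2\|x\|^2 + 2\|y\|^2$, a standard identity that holds for every norm coming from an inner product. All function names are hypothetical.

```python
import math


def add(x, y):
    """Algebraic structure: vector addition in the plane."""
    return (x[0] + y[0], x[1] + y[1])


def scale(alpha, x):
    """Algebraic structure: scalar multiplication."""
    return (alpha * x[0], alpha * x[1])


def inner(x, y):
    """Inner product (x, y) = x1*y1 + x2*y2."""
    return x[0] * y[0] + x[1] * y[1]


def norm(x):
    """Euclidean norm (1.2.1), generated by the inner product."""
    return math.sqrt(inner(x, x))


def dist(x, y):
    """Topological structure recovered from the norm: d(x, y) = ||x - y||."""
    return norm(add(x, scale(-1.0, y)))


def max_norm(x):
    """Assumed example of a norm that is not generated by an inner product."""
    return max(abs(x[0]), abs(x[1]))


def parallelogram_gap(nrm, x, y):
    """||x+y||^2 + ||x-y||^2 - 2||x||^2 - 2||y||^2; zero for inner-product norms."""
    lhs = nrm(add(x, y)) ** 2 + nrm(add(x, scale(-1.0, y))) ** 2
    rhs = 2 * nrm(x) ** 2 + 2 * nrm(y) ** 2
    return lhs - rhs


e1, e2 = (1.0, 0.0), (0.0, 1.0)
gap_euclid = parallelogram_gap(norm, e1, e2)    # zero up to rounding
gap_max = parallelogram_gap(max_norm, e1, e2)   # nonzero: the law fails
```

Note that `dist` cannot even be written down without `add` and `scale`, which is the sense in which the norm blends the topological and algebraic structures.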


3. MATHEMATICAL MODELING 


Since successful application of mathematics depends on successful mathe- 
matical modeling, it is worthwhile to say a few words about mathematical modeling. 
Roughly speaking, it is the formulation of a mathematical system whose mathe- 
matical behavior models certain aspects of a real system. For example, Ohm’s 
law e = Ri gives a mathematical model for the electrical behavior of a resistor. 

The resistor can be used to illustrate the main problem of mathematical 
modeling. In order to formulate a mathematical model which models many aspects 
of a real system, one is usually led to a mathematical model of great complexity 
and such models are often mathematically intractable. For example, to model the 
high frequency as well as the high voltage behavior of our resistor could require a 
mathematical model involving nonlinear partial differential equations. Such 
equations are notoriously difficult. On the other hand, if one allows simple mathe- 
matical models only, one often ends up with a mathematical model which does not 
yield a sufficiently accurate or detailed description of the real system’s behavior. 
For example, treating a long telephone or power line as a resistor without induc- 
tance and capacitance leads to a simple, yet usually inadequate mathematical model. 

The formulation, then, of a mathematical model is a compromise between 
mathematical intractability and inadequate description of the system being modeled. 
There usually is a choice of mathematical models between these two extremes. For 
this reason, one usually talks about “a” mathematical model for a system, not
“the” mathematical model.

Another point to be made about mathematical modeling is that it is by no 
means a purely mathematical problem. It has a mathematical side, but it also has,
for example, a physical or economical side. Indeed, mathematics alone would not
allow us to arrive at Ohm’s law. We need physics too. Mathematical modeling 
is the interface or bridge between pure mathematics and other disciplines. 


4. THE AXIOMATIC METHOD. THE PROCESS OF ABSTRACTION 


The reader probably had his first encounter with the axiomatic method in 
the study of Euclidean geometry. Since all of mathematics and the subject matter 
of this book, in particular, is based on the axiomatic method, let us recall some of 
the features of axiomatic reasoning. 

In every branch of mathematics one starts out with a collection of “undefinables.”
In the Euclidean geometry (of the plane) this includes “points” and
“lines.” Next, certain properties are stated. These properties (axioms, postulates)
play the role of mathematical legislation and form the starting point of mathe-
matical life, or reasoning. While these axioms usually have some basis in intuition,
it should be emphasized that mathematical reasoning plays no role¹ in establishing
these axioms. Some of the axioms of Euclidean geometry are: (a) the parallel
postulate, and (b) if L is a line, then there exists a point not on L. Once the axioms
have been chosen, one then tries to prove certain properties or theorems. For 
example, congruence or similarity of triangles is a question studied in Euclidean 
geometry. 

The axiomatic method is the method of mathematics; in fact, it is mathematics.
Even though there are many controversies in the mathematical community over 
the contents of sets of axioms, there is no question over the role of the axioms. 

While the role of the axiomatic method in mathematics has been known for 
centuries, the emphasis of this role that one finds today is something which 
developed only recently. One can see this change by comparing the research papers 
of the last century with those published today. In the past it required very careful 
reading in order to determine the hypotheses needed in order to get a particular 
conclusion. Today, with most papers written in the definition-theorem-proof style, 
it is very easy to determine this. 

Of course, axiomatic systems do not just happen. They must be formulated.
As mentioned in Section 1, while working on seemingly diverse problems, one 
often finds that similar techniques are being employed. For example, the reader 
may be familiar with the z-transform and Laplace-transform techniques as applied 
to discrete-time and continuous-time systems, respectively. Another example 
would be the techniques used to study the harmonics of a vibrating string and the 
energy levels of the hydrogen atom. It is natural, then, to inquire into the essential 
features (or properties) of these techniques which allow them to be applied in 
different ways. By listing these properties (or axioms) as hypotheses and deriving 
results from them one thereby goes from a concrete problem to a more abstract
problem. This process of abstraction plays a vital role in the development of
mathematics because it allows one to gain insight into a larger class of problems.

¹ We should note that a set of axioms should be consistent, that is, they should not lead to contra-
dictory statements. This question of consistency is a very important question in mathematical
logic, but we shall not go into it here. Instead, we refer the reader to Wilder [1].

This process of abstraction is very similar to the art of mathematical modeling 
discussed earlier. In modeling one seeks mathematical properties which describe 
(or model) a physical reality. In that realm there is a trade-off between finding a 
mathematically tractable model and deriving results sufficiently accurate for the 
physical reality. A similar trade-off occurs in the process of abstraction, namely, 
as one adds axioms (that is, becomes more concrete) one can derive sharper results;
however, these results apply to a smaller class of objects. We shall experience this 
trade-off often in the following chapters. Indeed the next four chapters progress 
by adding axioms. In the next chapter we have the axioms of set theory only; 
consequently, there are few theorems to be proved. However, these theorems are 
widely applicable. Chapters 3 and 4 add more axioms and, hence, more theorems 
can be proven. However, these additional theorems are less widely applicable. We 
end Chapter 5 with a discussion of Hilbert spaces, and there we have a number of 
very sharp and important results. 


5. PROOFS OF THEOREMS 


In the last section we discussed the role of axioms and hypotheses in the study 
of mathematics. However, there is another entity which plays a very important 
role, namely, the rules of logic. 

In mathematical logic (metamathematics) the rules of logic are treated like 
axioms and they are studied as an axiomatic system. We shall not take this view- 
point here. Instead, we shall assume certain rules of logic and use them freely 
throughout. In this section we shall recall some of the more important rules and 
point out their role in the proofs of theorems. 

The reader is undoubtedly familiar with the simple implication: “If A, then
B.” This means that if A is true or valid, then B is true or valid. It is sometimes
written as “$A \Rightarrow B$.” An example of this is:

If a is an even integer, then $a^2$ is an even integer.

The “if and only if” statement is a combination of two simple implications.
That is, the statement “A if and only if B” is equivalent to both statements “If
A, then B” and “If B, then A.” We emphasize this point because beginning students
sometimes misread the “if and only if” statements. Remember, in order to prove
“A if and only if B” one must prove two things, namely “$A \Rightarrow B$” and “$B \Rightarrow A$.”
The “if and only if” statement is sometimes written as “$A \Leftrightarrow B$.” We also say that
B is a necessary and sufficient condition for A. An example of an “if and only if”
statement is


$\{x^2 + 2bx + c > 0 \text{ for all real } x\} \Leftrightarrow \{b^2 - c < 0\},$

where b, c, and x are real numbers.


It is customary in mathematics to give a definition in the following format: 


An integer p is a prime number if the only divisors of p are p and 1. 


Even though the word “if” is used in the defining clause it plays the role of an
“if and only if” clause. In our notation, the definition above is equivalent to:


$\{p \text{ is a prime}\} \Leftrightarrow \{p \text{ and } 1 \text{ are the only divisors of } p\}.$


Let us now consider some of the actual techniques that are used in the proofs 
of theorems. The concept of the direct proof is the simplest. That is, in order to
prove “$A \Rightarrow B$” one assumes A and then derives B. For example, if one wants to
prove that


$x \ge 0 \Rightarrow x < e^x,$    (1.5.1)


one would assume that x is nonnegative and show directly that $x < e^x$. (See the
Exercises.)

The proof by contradiction is different. For this one uses the “fact” that the
statement “If A, then B” is equivalent to “If not B, then not A.” Actually, the
equivalence of these two statements is not a fact, but rather a rule of logic. In
simplest terms, the rule of logic underlying this equivalence is the Principle of
Contradiction (more commonly known as the Law of the Excluded Middle), which reads:


“Either A or not A.”


We shall accept this principle as one of the rules of logic. It is interesting to note
that there is an entire school of mathematics, called the “intuitionists” (Wilder
[1; p. 243]), that does not accept the Principle of Contradiction.

Proof by contradiction goes as follows: In order to show that $A \Rightarrow B$, one
assumes A and not B and then shows that this leads to a contradiction. In other
words, if not B, then necessarily not A.

An example of a proof by contradiction occurs when one shows that $\sqrt{2}$ is
irrational. The statement A is “$x = \sqrt{2}$” and the statement B is “x is irrational.”
Thus, A and not B becomes

“$x = \sqrt{2}$” and “x is rational.”

If x is rational, then x = p/q, where p and q are integers without a common divisor.
Statement A is then equivalent to

$$p^2 = 2q^2.$$

This shows that $p^2$ is even. It follows then that p is even. So $p^2$ is divisible by 4,
and therefore $q^2 = p^2/2$ is even. Hence q is even. Hence 2 is a common divisor
for p and q. But this is a contradiction, for p and q have no common divisor.
So if x is rational, $x \neq \sqrt{2}$. Or, if $x = \sqrt{2}$, then x is irrational.

Another type of proof which we shall use is proof by Mathematical Induction. 
Underlying this is the following rule of logic, called the Induction Principle. 

Let N = {1,2,...} denote the set of natural numbers (that is, the positive 
integers) and let M be a subset of N. 


If the following two properties hold: 


(1) 1 is in M, and 
(2) if n is in M, then (n + 1) is in M,

then M = N.
Let us give an application. Define $S_n$ by
$$S_n = a + ar + \cdots + ar^{n-1}, \qquad n = 1, 2, \ldots.$$
We then claim that, for $r \neq 1$,

$$S_n = a\,\frac{1 - r^n}{1 - r}, \qquad n = 1, 2, \ldots.$$  (1.5.2)

In order to prove (1.5.2) we shall let M denote those natural numbers for which
(1.5.2) is true. If n = 1, then

$$S_1 = a = a\,\frac{1 - r}{1 - r};$$

hence, 1 is in M. Now assume that n is in M and consider $S_{n+1}$. Since
$$S_{n+1} = S_n + ar^n,$$
we have

$$S_{n+1} = a\,\frac{1 - r^n}{1 - r} + ar^n,$$

where the last equality follows from the assumption that n is in M. By use of
simple arithmetic we see that

$$S_{n+1} = a\,\frac{1 - r^{n+1}}{1 - r},$$

so (n + 1) is in M. It follows from the Induction Principle that M = N. Hence
(1.5.2) has been proved.
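A numerical check is no substitute for the induction argument, but it can catch a slip in the algebra. The following Python sketch compares the term-by-term sum with the closed form in (1.5.2) using exact rational arithmetic. The function names and the sample values of a and r are illustrative assumptions, and $r \neq 1$ is assumed throughout.

```python
from fractions import Fraction


def partial_sum(a, r, n):
    """S_n = a + a*r + ... + a*r**(n-1), added term by term."""
    return sum(a * r**k for k in range(n))


def closed_form(a, r, n):
    """Right-hand side of (1.5.2); only valid for r != 1."""
    return a * (1 - r**n) / (1 - r)


# Illustrative values (assumptions, not taken from the text).
a, r = Fraction(3), Fraction(1, 2)
all_match = all(partial_sum(a, r, n) == closed_form(a, r, n)
                for n in range(1, 21))
print(all_match)
```

Exact fractions are used so that the comparison is an equality test rather than a floating-point approximation.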


EXERCISES 

1. In order to prove (1.5.1) let $f(x) = x$ and $g(x) = e^x$. Show that $f(0) < g(0)$ and
$f'(x) < g'(x)$ for $x > 0$. Now prove (1.5.1).

2. Show that $\sqrt{3}$ is irrational.

3. Show that the Induction Principle as stated in the text is equivalent to the
following: Let M be a subset of the integers such that (a) M is not empty, and
(b) $n \in M \Rightarrow (n + 1) \in M$. Then $k \in M \Rightarrow m \in M$ for all $m \geq k$.

4. Let $S_n = 1 + 2 + \cdots + n$. Show that $S_n = n(n + 1)/2$ for $n = 1, 2, \ldots$.

5. Let $S_n = 1^2 + 2^2 + \cdots + n^2$. Show that $S_n = n(n + 1)(2n + 1)/6$ for
$n = 1, 2, \ldots$.

6. Let $h > 0$ and show that $(1 + h)^n > 1 + nh$ for $n = 2, 3, \ldots$.


SUGGESTED REFERENCES 


Courant and Robbins [1]. 
Wilder [1]. 


2
Set-Theoretic Structure

1. Introduction
2. Basic Set Operations
3. Cartesian Products
4. Sets of Numbers
5. Equivalence Relations and Partitions
6. Functions
7. Inverses
8. System Types


1. INTRODUCTION 


The purpose of this chapter is to present a brief review of certain basic set- 
theoretic concepts. The reader already familiar with these concepts may skim 
the chapter to develop familiarity with the notational conventions and then go on 
to Chapter 3. 

We say that a set X is any well-defined collection of things. These things are 
referred to as the members or elements of X. In this book we are usually concerned 
with collections of numbers, sequences, functions, or, sometimes, collections of
sets.


EXAMPLES 


(1) The set R of all real numbers. 
(2) The set of all sequences of the form
$$\{x_1, x_2, \ldots, x_k, \ldots\},$$
where $x_k$, $k = 1, 2, \ldots$, is a complex number.
(3) The set of all sequences of complex numbers $x = \{x_1, x_2, \ldots\}$ such that
$$\sum_{k=1}^{\infty} |x_k|^2 < \infty.$$

(4) The set C[0,T] of all real-valued continuous functions x defined on the
closed interval $0 \leq t \leq T$.


(5) The collection of all closed intervals on the real line. 


We will usually denote sets by capital letters

    A, B, X, ....

We will use R and C to denote the sets of all real and complex numbers, respectively.
The elements or members of a set will be denoted by lower case letters

    a, b, x, ....

If x is an element of a set A, we shall write this as $x \in A$. If x is not in A, we shall
denote this by $x \notin A$.

One way of defining a set is by listing all of its elements. The set A consisting
of the functions $f_1(t) = t$, $f_2(t) = t^2$, and $f_3(t) = t^3$ is denoted by

$$A = \{f_1, f_2, f_3\}.$$


Another way of defining a set B is (1) to assume that each element in B is an 
element in some well-defined universal set, say X, and (2) to list the properties that 
elements of the universal set must satisfy in order to be in B. For example, let X 
be the set of all sequences of complex numbers $x = \{x_1, x_2, x_3, \ldots\}$ and B be all
elements of X possessing the property

$$\sum_{n=1}^{\infty} |x_n|^2 < \infty.$$

We shall use the following notation

$$B = \Bigl\{x \in X : \sum_{n=1}^{\infty} |x_n|^2 < \infty\Bigr\},$$

which is read “B is the set of all elements of X such that (the colon stands for
‘such that’) $\sum_{n=1}^{\infty} |x_n|^2 < \infty$.” When there is no possibility of confusion we shall
simply write

$$B = \Bigl\{x : \sum_{n=1}^{\infty} |x_n|^2 < \infty\Bigr\}.$$
Abstracting this we see that if a set B is defined by a property P this can be
written as
$$B = \{x \in X : P\},$$
which reads: “the set of all x in X such that P is true.” Sometimes we define a
set in terms of two properties P and Q. The set
$$B = \{x \in X : P \text{ and } Q\} = \{x \in X : P, Q\}$$  (2.1.1)

means “the set of all x in X such that both P and Q are true.” [The comma in the
second expression in (2.1.1) is to be read “and.”] For example,

$$\{x \in R : x > 1 \text{ and } x < 2\} = \{x \in R : 1 < x < 2\},$$
$$\{x \in R : x > 1 \text{ and } x < 0\} = \emptyset,$$

where $\emptyset$ is used to denote the empty set; that is, the set with no elements.
The set

$$A = \{x \in X : P \text{ or } Q\}$$  (2.1.2)

means “the set of all x in X such that P, or Q, is true.” In this case we use the
so-called “inclusive or” which means P or Q or both. (The “exclusive or” is
P or Q but not both.) For example,

$$\{x \in R : x > 2 \text{ or } x < 3\} = R.$$
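The notation $\{x \in X : P\}$ has a direct counterpart in Python's set comprehensions. The sketch below is only an illustration: the universal set X is taken to be a small range of integers (an assumption made so that the sets can be listed), and the particular properties are chosen to mirror the examples above.

```python
# Universal set X: a finite stand-in for the real line (an assumption,
# chosen so that every set below can be enumerated).
X = set(range(-5, 6))

# {x in X : P and Q}
B_and = {x for x in X if x > 1 and x < 4}

# No x satisfies both properties, so the result is the empty set.
B_empty = {x for x in X if x > 1 and x < 0}

# Inclusive "or": every x satisfies at least one of the two properties.
A_or = {x for x in X if x > 2 or x < 3}

# Inclusive versus exclusive "or" for P: x >= 0 and Q: x <= 2.
inclusive = {x for x in X if x >= 0 or x <= 2}
exclusive = {x for x in X if (x >= 0) != (x <= 2)}

print(B_and, B_empty, A_or == X, inclusive - exclusive)
```

The exclusive version drops exactly those x for which both properties hold.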


A set A is said to be finite if it contains a finite number of elements. Otherwise,
A is said to be an infinite set. A countably infinite set is one containing a countably
infinite number of elements. A countable set is either finite or countably infinite.
An uncountable set is one that is infinite but not countably infinite. (See Appendix
B on Cardinality.)




Two sets A and B are said to be equal, written A = B, if they both contain
exactly the same elements. For example, the sets

$$A = \{1, 2\} \quad \text{and} \quad B = \{x : x^2 - 3x + 2 = 0\}$$

are equal.

We say that a set A is a subset of a set B if each element of A is also an element
of B. We denote this by
$$A \subset B.$$

We also say that A is contained in B. Note that both B and $\emptyset$, the empty set, are
always subsets of B. If x is an element of a set A, the subset of A containing exactly
the element x is denoted by {x}. (Note that there is a difference between the
element x and the set {x}.) If A is not a subset of B, we write

$$A \not\subset B$$

and say “A is not contained in B.” A set A is a proper subset of a set B if $A \subset B$
and $A \neq B$. If $A \subset B$, we also sometimes say that B is a superset of A.

Two sets A and B are said to be disjoint if no element of A is in B and no
element of B is in A. This is illustrated in Figure 2.1.1.

Figure 2.1.1.


2. BASIC SET OPERATIONS 


The union of sets A and B is the set made up of elements which belong to
A or to B or to both. Refer to Figure 2.2.1. We denote the union of A and B by
$A \cup B$ or $B \cup A$. In set-theoretic notation, we have

$$A \cup B = \{x : x \in A \text{ or } x \in B\}.$$  (2.2.1)

Figure 2.2.1. $A \cup B$ shaded.

If $\{A_\alpha\}$ is an arbitrary collection¹ of sets, then the union of this collection,
denoted $\bigcup_\alpha A_\alpha$, is the set made up of all elements x such that x belongs to at least
one of the sets $A_\alpha$.

¹ The collection $\{A_\alpha\}$ is indexed with the index $\alpha$ which ranges over some index set, and this
index set may be finite, countably infinite, or uncountable.


EXAMPLE 1. Let $A_1, A_2, A_3, \ldots$ be sets of real-valued functions defined on the
interval $0 \leq t \leq 2\pi$. $A_1$ is the set of all functions of the form $x(t) = a_1 \cos t + b_1 \sin t$,
$A_2$ is the set of all functions of the form $x(t) = a_1 \cos t + b_1 \sin t + a_2 \cos 2t +
b_2 \sin 2t$, and $A_n$ is the set of all functions of the form $x(t) = \sum_{i=1}^{n} (a_i \cos it +
b_i \sin it)$, and so forth. It follows that

$$\bigcup_{n=1}^{\infty} A_n = \text{the set of all functions with a finite Fourier series expansion.}$$ ∎


The intersection of a set A and a set B is the set made up of elements in both
A and B. We denote the intersection by $A \cap B$ or $B \cap A$. See Figure 2.2.2.
Equivalently,

$$A \cap B = \{x : x \in A \text{ and } x \in B\}.$$

Figure 2.2.2. $A \cap B$ shaded.

If $\{A_\alpha\}$ is any collection of sets, then the intersection of this collection, denoted
$\bigcap_\alpha A_\alpha$, is the set made up of all elements x such that x belongs to every set $A_\alpha$.


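EXAMPLE 2. Let $A_1, A_2, A_3, \ldots$ be sets of continuous functions defined by
$$A_n = \Bigl\{x \in C[0,T] : |x(t)| < 1 + \frac{1}{n} \text{ for all } t\Bigr\}, \qquad n = 1, 2, \ldots.$$
Then

$$\bigcap_{n=1}^{\infty} A_n = \{x \in C[0,T] : |x(t)| \leq 1 \text{ for all } t\}.$$ ∎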


The difference of the sets A and B, denoted $A - B$, is the set made up of all
elements of A that do not belong to B. See Figure 2.2.3. In other words,

$$A - B = \{x : x \in A, x \notin B\}.$$

Figure 2.2.3. $A - B$ shaded.

The symmetric difference of sets A and B is denoted by $A \,\Delta\, B$ and defined by
(see Figure 2.2.4)

$$A \,\Delta\, B = (A - B) \cup (B - A).$$

Figure 2.2.4. $A \,\Delta\, B$ shaded.

If X is a universal set and A is a set contained in X, then the complement of
A, denoted by $A'$, is the set made up of all elements of X that do not belong to A.
Equivalently,
$$A' = \{x \in X : x \notin A\} = X - A$$
(see Figure 2.2.5).


Figure 2.2.5. $A'$ shaded.

Figure 2.2.6. (a) $A' \cap B' = (A \cup B)'$. (b) $(A \cap B)' = A' \cup B'$.


Note that $X' = \emptyset$ and $\emptyset' = X$. Moreover, $X = A \cup A'$, $\emptyset = A \cap A'$, and
$(A')' = A$ for arbitrary A.
The following two identities are often useful:
$$(A \cup B)' = A' \cap B',$$
$$(A \cap B)' = A' \cup B'.$$
(See Figure 2.2.6.)

De Morgan's Laws are generalizations of the last two identities and are stated
as follows:

$$\Bigl(\bigcup_\alpha A_\alpha\Bigr)' = \bigcap_\alpha A_\alpha'$$

and

$$\Bigl(\bigcap_\alpha A_\alpha\Bigr)' = \bigcup_\alpha A_\alpha',$$

where $\{A_\alpha\}$ is any collection of sets contained in a universal set X.
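The set operations of this section can be tried out on finite sets. The following Python sketch uses the built-in set type; the universal set X, the sets A and B, and the small family of sets are all illustrative assumptions. It checks De Morgan's Laws on this one example, which of course is not a proof.

```python
X = set(range(10))            # universal set (illustrative assumption)
A = {0, 1, 2, 3}
B = {2, 3, 4, 5}


def complement(S, universe=X):
    """S' = universe - S."""
    return universe - S


union = A | B
intersection = A & B
difference = A - B
sym_diff = A ^ B              # should equal (A - B) | (B - A)

# A small family {A_alpha} with three members.
family = [{0, 1}, {1, 2}, {1, 7, 9}]
big_union = set().union(*family)
big_inter = set(X).intersection(*family)

# De Morgan's Laws for the family.
demorgan_1 = complement(big_union) == set(X).intersection(
    *(complement(S) for S in family))
demorgan_2 = complement(big_inter) == set().union(
    *(complement(S) for S in family))
print(demorgan_1, demorgan_2)
```

The intersection of the family is started from X so that the empty family would give X, in keeping with complements being taken relative to X.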


3. CARTESIAN PRODUCTS


An ordered pair is a pair of objects x and y where one of the pair is designated
as the first member of the pair and the other is designated as the second. We
denote ordered pairs by (x,y) with the obvious order.


EXAMPLE 1. In the case of a system with two input channels, for example,
the inputs to a stereo amplifier, denote the input on channel #1 by $x_1$ and the input
on channel #2 by $x_2$; then the system input is the ordered pair $(x_1, x_2)$. Note that the system
input $(e^{-t}, \sin t)$ is obviously different from the system input $(\sin t, e^{-t})$. ∎

If A and B are sets, then the set made up of all ordered pairs (a,b), where
$a \in A$ and $b \in B$, is referred to as the Cartesian product of A and B. We write $A \times B$
for the Cartesian product and say “A cross B.” Note that $A \times B$ is not the same
thing as $B \times A$ unless A = B (or one of the two sets is empty).

EXAMPLE 2.

(1) $R \times R$ or $R^2$ is the set of all ordered pairs of real numbers.
(2) $\{1,3\} \times \{7,5\} = \{(1,7), (1,5), (3,7), (3,5)\}$.
(3) $C[0,T] \times C[0,T]$ is the set of all ordered pairs of functions in C[0,T]. ∎

An ordered n-tuple, $(x_1, x_2, \ldots, x_n)$, is an n-tuple of objects, where one of them
is designated as the first, one as the second, and so on until one is designated as
the nth. When n = 3 we talk about ordered triplets.

If $X_1, X_2, \ldots, X_n$ are sets, we define the Cartesian product $X_1 \times X_2 \times \cdots
\times X_n$ as the set of all ordered n-tuples $(x_1, x_2, \ldots, x_n)$, where $x_1 \in X_1, x_2 \in X_2, \ldots,
x_n \in X_n$. Sometimes this is denoted by $\prod_{i=1}^{n} X_i$.
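For finite sets the Cartesian product can be listed explicitly. The sketch below uses `itertools.product` from the Python standard library to reproduce part (2) of Example 2 and to build the set of ordered triplets of 0's and 1's; the variable names are illustrative.

```python
from itertools import product

A = [1, 3]
B = [7, 5]

AxB = set(product(A, B))      # all ordered pairs (a, b), a in A, b in B
BxA = set(product(B, A))      # the order of the factors matters

# Ordered triplets: the product {0,1} x {0,1} x {0,1}.
triples = set(product([0, 1], repeat=3))

print(sorted(AxB))
print(AxB == BxA, len(triples))
```

Tuples are used for the pairs because a tuple, unlike a set, records which member comes first.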




4. SETS OF NUMBERS 


There are a few sets of numbers which we will use throughout the text. 
These sets are the following: 


The natural numbers N:
$$N = \{1, 2, 3, \ldots\}.$$
The integers Z:
$$Z = \{\ldots, -2, -1, 0, 1, 2, \ldots\}.$$
The rational numbers Q:
$$Q = \{x : x = p/q, \text{ where } p \in Z, q \in Z, \text{ and } q \neq 0\}.$$

The real numbers R.
The complex numbers C.


In many ways the most interesting set of numbers is the real number system R
or, as it is also called, the real line. Defining R exactly is not an issue here. Rather
we shall assume R given and merely remark on certain properties of R.

We remark first that neither $+\infty$ nor $-\infty$ is a real number. When we adjoin
$+\infty$ and $-\infty$ to R we have what is referred to as the extended real numbers.

A set $A \subset R$ is said to be bounded from above if there exists a real number u
such that $a \leq u$ for all $a \in A$. The real number u is said to be an upper bound of A.
We define bounded from below and lower bound in an analogous way. If a set A
is both bounded from above and below, we say that A is bounded.

A real number M is said to be the maximum of a set $A \subset R$ if $M \in A$ and M is
an upper bound for A. A real number m is said to be the minimum of a set $A \subset R$
if $m \in A$ and m is a lower bound for A. Needless to say, even a bounded set need
not have either a maximum or a minimum.


EXAMPLE 1. The numbers 1 and 0 are, respectively, the maximum and
minimum of the set $A = \{x \in R : 0 \leq x \leq 1\}$. The set $B = \{x \in R : 0 < x < 1\}$ has
no maximum or minimum; however, 1 and 0 are, respectively, an upper bound and
a lower bound for B. ∎


Let A be a nonempty set that is bounded from above, and let U denote the
set of all upper bounds of A. For example, if A is the interval $0 < x < 1$, then
$U = \{x \in R : 1 \leq x < \infty\}$. It is a fundamental fact about the real number system
that the set U always has a minimum. This minimum of U is clearly the “least
upper bound” of A. We denote it by sup A and say “the supremum of A.” If
A has a maximum, then clearly max A = sup A. If A is not bounded from above,
we shall signal this fact by writing $\sup A = \infty$. Further, if A is empty, we shall
write $\sup A = -\infty$. With these conventions the supremum exists for any subset
of R. One sometimes sees l.u.b. A used in place of sup A.

Next let A be a set that is bounded from below, and let L denote the set of all
lower bounds of A. Again, as long as A is nonempty L has a maximum. Obviously,
this maximum is the “greatest lower bound” of A. We refer to it as “the infimum
of A” and write inf A. Analogously to supremum, we say that $\inf A = -\infty$ if A
is not bounded from below and $\inf A = \infty$ if A is empty. If A has a minimum,
then min A = inf A. One sometimes sees g.l.b. A used in place of inf A.

Sup A and inf A, then, are defined for any $A \subset R$. We remark in passing that
a similar statement for the rational number system Q is not true.


EXAMPLE 2. Let $A \subset Q$, the rational numbers, be the set

$$A = \{x \in Q : 0 < x < \sqrt{2}\}.$$

The set A has neither a maximum nor a supremum in Q. ∎
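The conventions for the empty set, and the reason the set A of Example 2 has no maximum, can both be illustrated in a few lines of Python. In the sketch below, `sup` and `inf` are hypothetical helper names that handle only finite sets (where the supremum is simply the maximum). The map $x \mapsto (2x + 2)/(x + 2)$ is an assumption introduced here, not taken from the text; for a rational x with $0 < x$ and $x^2 < 2$ it returns a larger rational whose square is still less than 2.

```python
import math
from fractions import Fraction


def sup(A):
    """Supremum of a finite set of reals; sup of the empty set is -infinity."""
    return max(A, default=-math.inf)


def inf(A):
    """Infimum of a finite set of reals; inf of the empty set is +infinity."""
    return min(A, default=math.inf)


def larger_element(x):
    """For rational x > 0 with x*x < 2, return a rational y > x with y*y < 2."""
    return (2 * x + 2) / (x + 2)


# Starting from 1, build a strictly increasing chain inside A.
x = Fraction(1)
chain = [x]
for _ in range(5):
    x = larger_element(x)
    chain.append(x)

print(sup([]), inf([]), sup([0.5, 1.0]))
print([str(q) for q in chain])
```

Since every element of A has a larger element of A above it, A cannot contain a maximum. The claim about the supremum in Q needs a separate argument and is not addressed by this sketch.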


Before we go on let us recall that a set A of complex numbers is said to be
bounded if the set $\{|z| : z \in A\}$ is bounded in R.


5. EQUIVALENCE RELATIONS AND PARTITIONS


Given a set X one is often interested in “cutting it up” into a family of
disjoint subsets of X as illustrated in Figure 2.5.1. The technical term is “partition.”


2.5.1 DEFINITION. A family $\{A_\alpha\}$ of subsets of a set X is said to be a partition
of X if

(1) $A_\alpha \cap A_\beta = \emptyset$ or $A_\alpha = A_\beta$, that is, the subsets $A_\alpha$ are pairwise disjoint or
indexed more than once, and
(2) $\bigcup_\alpha A_\alpha = X$, that is, the union of the $A_\alpha$'s is all of X.


EXAMPLE 1. Let X = C[0,T] be the set of all continuous functions x defined
on the interval $0 \leq t \leq T$. Let $\{A_\alpha\}$ be defined as follows: The index set is $R^+ =
\{\alpha : 0 \leq \alpha < \infty\}$ and

$$A_\alpha = \Bigl\{x \in C[0,T] : \int_0^T |x|^2\,dt = \alpha\Bigr\}.$$

It is clear that $A_\alpha \cap A_\beta = \emptyset$ whenever $\alpha \neq \beta$. It is also clear that one has
$$\int_0^T |x|^2\,dt < \infty$$
for all $x \in C[0,T]$, so $X = \bigcup_{\alpha \geq 0} A_\alpha$.

Figure 2.5.1.


The usual way that one characterizes a partition of a set is in terms of an
equivalence relation. For this reason we now turn to relations and equivalence
relations.

Given a set X, a relation on X can be defined as any subset of the Cartesian
product $X \times X$. If R is a relation on X and the ordered pair (x,y) is in R, we say
that “x is related to y under the relation R” and we write $xRy$. If the ordered pair
(x,y) is not in R, we say that “x is not related to y under the relation R” and
we write $x \not R\, y$.


EXAMPLE 2. Let X be a set of three people: Roger, aged 87, Waugh, aged
25, and Cuthburt, aged 104. Let R be the relation on X defined by xRy if and only
if x is younger than y. The subset R of $X \times X$ will then be made up of the ordered
pairs (Waugh,Roger), (Waugh,Cuthburt), (Roger,Cuthburt). ∎


A relation R on a set X is said to be reflexive if for each $x \in X$, one has xRx;
that is, each x is related to itself.


EXAMPLE 3. Let X be the real line. The relation R on X defined by
$$xRy \Leftrightarrow x \leq y$$
is reflexive. ∎


A relation R on a set X is said to be symmetric if xRy implies that yRx. That
is, if x is related to y, then y is related to x.


EXAMPLE 4. Let X be the real line, and let R be the relation on X defined by
$$xRy \Leftrightarrow |x| = |y|.$$
R is symmetric and reflexive. ∎


A relation R on the set X is said to be transitive if xRy and yRz imply that
xRz. In other words, if x is related to y and y is related to z, then x is related to z.


EXAMPLE 5. Let X be the real line, and let the relation R on X be defined by
$$xRy \Leftrightarrow x < y.$$
R is transitive, but neither symmetric nor reflexive. ∎
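Because a relation on X is just a subset of $X \times X$, the three properties can be tested mechanically when X is finite. In the Python sketch below a relation is stored as a set of ordered pairs. The finite set X of integers stands in for the real line of Examples 3 to 5 (an assumption made so that the pairs can be enumerated), and the function names are illustrative.

```python
def is_reflexive(X, R):
    """xRx for every x in X."""
    return all((x, x) in R for x in X)


def is_symmetric(R):
    """xRy implies yRx."""
    return all((y, x) in R for (x, y) in R)


def is_transitive(R):
    """xRy and yRz imply xRz."""
    return all((x, w) in R for (x, y) in R for (z, w) in R if y == z)


X = range(-3, 4)   # finite stand-in for the real line (assumption)
leq = {(x, y) for x in X for y in X if x <= y}               # Example 3
same_abs = {(x, y) for x in X for y in X if abs(x) == abs(y)}  # Example 4
less = {(x, y) for x in X for y in X if x < y}               # Example 5

for name, R in [("<=", leq), ("|x|=|y|", same_abs), ("<", less)]:
    print(name, is_reflexive(X, R), is_symmetric(R), is_transitive(R))
```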


We are now ready to define equivalence relation.


2.5.2 DEFINITION. A relation R on a set X is said to be an equivalence
relation if it is reflexive, symmetric, and transitive.


If xRy, where R is an equivalence relation, we will say that “x is equivalent
to y” and write $x \sim y$. Similarly, if $x \not R\, y$, we say that “x is not equivalent to y”
and write $x \nsim y$. Needless to say, the notation “$x \sim y$” can lead to confusion if
more than one equivalence relation is under consideration.


EXAMPLE 6. We refer to Example 1 and note that the relation R on X =
C[0,T] defined by

$$xRy \Leftrightarrow \int_0^T |x(t)|^2\,dt = \int_0^T |y(t)|^2\,dt$$

is an equivalence relation. ∎


We now come to the main point of this section: There is an intimate and
natural connection between partitions and equivalence relations. In particular,
any partition of a set X naturally determines an equivalence relation on X and
vice versa.

Before stating the basic two theorems, we need the concept of an equivalence
class. If R is an equivalence relation on a set X and $x \in X$, then the equivalence
class determined by x is the subset

$$C_x = \{y \in X : y \sim x\};$$

that is, $C_x$ is the set of elements of X that are equivalent to x. We sometimes
denote $C_x$ by [x].


EXAMPLE 7. (Continuing Example 6)
$$C_x = \Bigl\{y \in C[0,T] : \int_0^T |y(t)|^2\,dt = \int_0^T |x(t)|^2\,dt\Bigr\}.$$ ∎


2.5.3 THEOREM. Let R be an equivalence relation on a set X. Then the family
C of all equivalence classes is a partition of X.


Proof: Let $C_x$ and $C_y$ be any two equivalence classes. We want to show that
either $C_x = C_y$ or $C_x \cap C_y = \emptyset$. Suppose first that $C_x$ and $C_y$ are not disjoint.
Let $w \in C_x \cap C_y$. By definition of equivalence class, wRx and wRy. Let z be any
point in $C_x$, that is, zRx. We can then use the following line of argument:

    zRx, wRx, wRy  =>  zRx, xRw, wRy      (by the symmetry of R)
                   =>  zRw, wRy           (by the transitivity of R)
                   =>  zRy                (by the transitivity of R)
                   =>  $z \in C_y$  =>  $C_x \subset C_y$.

A similar argument shows that $C_y \subset C_x$, so $C_x = C_y$. Thus the only possibilities
are $C_x = C_y$ or $C_x \cap C_y = \emptyset$. Since R is reflexive, we have $x \in C_x$ for all $x \in X$. It
follows, then, that $X = \bigcup_x C_x$. ∎
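Theorem 2.5.3 can be watched at work on a finite set. The Python sketch below groups the elements of X into equivalence classes by comparing each element with one representative of every class found so far, which is legitimate precisely because the relation is assumed to be an equivalence relation. The set X, the relation $|x| = |y|$, and the function names are illustrative assumptions.

```python
def equivalence_classes(X, related):
    """Group X into classes; `related` must be an equivalence relation."""
    classes = []
    for x in X:
        for cls in classes:
            representative = next(iter(cls))
            if related(x, representative):
                cls.add(x)
                break
        else:
            classes.append({x})   # x starts a new class
    return classes


def is_partition(X, family):
    """Nonempty, pairwise disjoint subsets whose union is all of X."""
    covers = set().union(*family) == set(X)
    disjoint = sum(len(S) for S in family) == len(set().union(*family))
    return covers and disjoint and all(family)


X = list(range(-3, 4))
classes = equivalence_classes(X, lambda x, y: abs(x) == abs(y))
print(classes, is_partition(X, classes))
```

Comparing with a single representative would give wrong answers for a relation that is not transitive or not symmetric, so the helper should not be used to test whether a relation is an equivalence relation.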


EXAMPLE 8. Let $L_2[a,b]$ denote the set of all real- (or complex-) valued
functions x such that

$$\int_a^b |x(t)|^2\,dt < \infty.$$

We define the relation R by

$$xRy \Leftrightarrow \int_a^b |x(t) - y(t)|^2\,dt = 0.$$

It follows that²
$$x \sim y \Leftrightarrow \{\text{The set of points } t, \text{ for which } x(t) \neq y(t), \text{ has measure zero}\}.$$

The equivalence class $C_0$ containing the function that is identically zero is a set
made up of a huge number of functions each of which is zero almost everywhere. ∎

The next theorem provides the other half of the connection between partitions 
and equivalence relations. 


2.5.4 THEOREM. Let $\{A_\alpha\}$ be a partition of a set X. Then the relation R defined
by
$$xRy \Leftrightarrow x \text{ is in the same partition subset as } y$$
is an equivalence relation on X.


The proof of this theorem follows immediately from the definition of a partition.
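The construction in Theorem 2.5.4 is equally easy to carry out for a finite partition. The following Python sketch builds the relation as a set of ordered pairs and checks the three defining properties directly; the partition and the function name are illustrative assumptions.

```python
def relation_from_partition(partition):
    """xRy exactly when x and y lie in the same subset of the partition."""
    return {(x, y) for block in partition for x in block for y in block}


partition = [{1, 2}, {3}, {4, 5, 6}]   # illustrative partition of X
X = set().union(*partition)
R = relation_from_partition(partition)

reflexive = all((x, x) in R for x in X)
symmetric = all((y, x) in R for (x, y) in R)
transitive = all((x, w) in R for (x, y) in R for (z, w) in R if y == z)
print(reflexive, symmetric, transitive)
```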


6. FUNCTIONS 


Most readers will have encountered real-valued functions of a real variable 
early in their training. Recall that such a function is a rule f which associates with 
each real number x another real number denoted by f(x). However, the notion 
of a function is basically a set-theoretic concept and does not depend on the real 
numbers. 

Suppose we have two sets X and Y. Suppose further that we have a rule f
which assigns to each element in X exactly one element of Y. Then we say that
f is a Y-valued function defined on X, or f is a function defined on X with values in Y.
The terms mapping, transformation, and operator are sometimes used in place
of function. We also say that f is a function which transforms, or maps, X into Y,
and we will denote this by $f: X \to Y$.

An important point of notation: When we write $f$ or $f(\cdot)$, we mean the function
itself. When we write $f(x)$, we mean the element in Y assigned to the element x in X.


² The reader who is unfamiliar with the basic concepts of measure and integration theory can
study these concepts in Appendix D. Roughly speaking, a set of measure zero is either finite or an
infinite set whose “total length” (that is, measure) is zero.


If $f: X \to Y$, $g: X \to Y$, and $g(x) = f(x)$ for each $x \in X$, then we say that $f = g$,
that is, they are one and the same function. We remind the reader not to confuse
a function with its representation. That is, $f(x) = |x|$ and $g(x) = \exp(\log|x|)$ are
representations of the same function (on the set of nonzero real numbers, where
$\log|x|$ is defined).

If f is a function defined on a set X, we say that the set X is the domain of f
and we write $\mathcal{D}(f) = X$. A function g such that $\mathcal{D}(f) \subset \mathcal{D}(g)$ and $g(x) = f(x)$ for each
x in $\mathcal{D}(f)$ is said to be an extension of f. Next let f be a function with $\mathcal{D}(f) = X$,
and let A be a subset of X. A function h such that $\mathcal{D}(h) = A$ and $h(x) = f(x)$ for
each x in A is said to be the restriction of f to A. Sometimes h is denoted by $f|_A$.

Let $f: X \to Y$ be given. If $y = f(x)$, we say that y is the image of x or that
y is the value of f at x. Also we say that x is a pre-image of y. We cannot say the
pre-image, for there may be more than one x with y as its image. The set of all
elements of Y that are images under f of elements of X is said to be the range of
f and written $\mathcal{R}(f)$; that is,

$$\mathcal{R}(f) = \{y \in Y : y = f(x) \text{ for some } x \text{ in } X\}.$$

If $\mathcal{R}(f) \subset Y$, we say that f maps X into Y. If $\mathcal{R}(f) = Y$, we say that f maps X onto
Y. If $\mathcal{R}(f)$ contains only one point, we then say that f is a constant function.

The mapping I of a set X onto itself given by $I(x) = x$ for all x in X is said
to be the identity mapping on X.

If the domain of f does not contain two elements with the same image; that
is, if

$$[f(x_1) = f(x_2)] \Rightarrow [x_1 = x_2],$$

then we say that f is a one-to-one mapping. Note that f is a one-to-one mapping
if and only if every point y in $\mathcal{R}(f)$ has precisely one pre-image point in $\mathcal{D}(f)$.

The real-valued functions $y = x^2$ and $z = e^{-y}$ induce a mapping of x into z
given by $z = e^{-x^2}$. This composition of two functions has an abstract formulation.
Let $f: X \to Y$ and $g: Y \to Z$ be given. We define the composition of g and f by
$(gf)(x) = g(f(x))$. This mapping is defined on X with values in Z. The composition
of g and f is illustrated in Figure 2.6.1. We note that one can meaningfully write
gf or say that gf exists if and only if $\mathcal{D}(g) \supset \mathcal{R}(f)$. If this is not so, one says
that gf does not exist. It may be that gf exists, but fg does not. Give an example.

Figure 2.6.1.
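For functions on finite sets, the notions of range, one-to-one, onto, and composition can be made concrete by storing a function as a Python dictionary whose keys form the domain. The sketch below is one possible illustration. The dictionaries f and g and all function names are assumptions, and the final lines show one instance in which gf exists while fg does not.

```python
def range_of(f):
    """R(f): the set of all images f(x), x in D(f)."""
    return set(f.values())


def is_one_to_one(f):
    """No two elements of the domain share an image."""
    return len(range_of(f)) == len(f)


def is_onto(f, Y):
    """R(f) = Y."""
    return range_of(f) == set(Y)


def compose(g, f):
    """Return gf, x -> g(f(x)); gf exists only if D(g) contains R(f)."""
    if not range_of(f) <= set(g):
        raise ValueError("gf does not exist: R(f) is not contained in D(g)")
    return {x: g[f[x]] for x in f}


f = {1: "a", 2: "b", 3: "a"}          # f: X -> Y with X = {1, 2, 3}
g = {"a": 10, "b": 20, "c": 30}       # g: Y -> Z with Y = {"a", "b", "c"}

gf = compose(g, f)
print(gf, is_one_to_one(f), is_one_to_one(g))

try:
    compose(f, g)                      # fg: R(g) is not contained in D(f)
except ValueError as err:
    print(err)
```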

If $f_n$ maps a set $X_n$ into a set $X_{n+1}$, $n = 1, 2, \ldots, N$, the composition of $f_1, \ldots,
f_N$ is defined by

$$(f_N f_{N-1} \cdots f_1)(x) = f_N(f_{N-1}(\cdots f_1(x) \cdots))$$

for each $x \in X_1$. We leave it to the reader to show the associative law, that is,

$$(f_N f_{N-1} \cdots f_{k+1})(f_k \cdots f_1) = f_N f_{N-1} \cdots f_1 \quad \text{for arbitrary } 1 \leq k < N.$$

Let us recall what we mean by the graph of a real-valued function of a real
variable. Consider the function $y = (x + 1)^2$. The graph of this function is a subset
of the plane $R^2$. A point (x,y) of the plane lies in the graph if and only if $y =
(x + 1)^2$. Now consider the general case. Let $f: X \to Y$. The graph of the function
f is the subset of the product set $X \times Y$ made up of all ordered pairs (x,y) such
that $y = f(x)$; that is,

$$\mathrm{Gr}(f) = \mathrm{graph}(f) = \{(x,y) \in X \times Y : y = f(x)\}.$$

Needless to say, it is not often that one can “plot” this graph.
Let us consider some illustrations of these concepts. 
EXAMPLE 1. Let $X = Y = R^n$, the set of all n-tuples of real numbers. Given
an $n \times n$ real matrix A, the matrix equation
$$y = Ax, \qquad x, y \in R^n,$$
represents a mapping of X into itself. In Chapter 4 we will investigate the conditions
under which this mapping is one-to-one and/or onto. ∎

EXAMPLE 2. Given a real number $\alpha$, let $C_\alpha$ denote the vertical line in the
complex plane defined by
$$C_\alpha = \{s : \mathrm{Re}(s) = \alpha\}.$$

Let $X = L_{2,\alpha}(-i\infty, i\infty)$ be the set of all complex-valued functions x(s) defined on $C_\alpha$
such that

$$\frac{1}{2\pi i} \int_{C_\alpha} |x(s)|^2\,ds < \infty.$$

Let t(s) denote a bounded continuous complex-valued function defined on $C_\alpha$.
Then
$$y(s) = t(s)x(s) \quad \text{for all } s \in C_\alpha$$

represents a mapping of X into itself. ∎


EXAMPLE 3. Let $X = L_2(-\infty,\infty)$ and consider the electrical network of
Figure 2.6.2 where the inputs x are in X. Letting a = 1/RC, a mathematical model
for this network is given by

$$y(t) = \int_{-\infty}^{t} h(t - \tau)\,x(\tau)\,d\tau,$$  (2.6.1)
where h(t), the unit impulse response or weighting function, is given by


1 
h(t)=-—e™", 
a 


Figure 2.6.2.


We state without proof that (2.6.1) represents a mapping f of X into itself. It is
shown in Exercise 1, Section 4.4, that f is one-to-one. ∎


EXAMPLE 4. Let $X = N \times N$, the Cartesian product of the set of natural
numbers with itself. Let Y = {0,1}. The mapping $\delta_{ij}: X \to Y$, $i, j = 1, 2, \ldots$,
defined by

$$\delta_{ij} = \begin{cases} 1, & \text{if } i = j \\ 0, & \text{if } i \neq j \end{cases}$$

is called the Kronecker function.
It should be noted that there is nothing sacred about the set N in this example.
It could be replaced by any set, finite, infinite, even uncountably infinite. ∎


EXAMPLE 5. $C^n[a,b]$, where $n \geq 0$ is an integer, denotes the set of all real-valued
(or complex-valued) functions x defined on the interval $a \leq t \leq b$, such
that x is continuous, and the derivatives $d^k x/dt^k$ of order $k \leq n$ exist and are
continuous on [a,b]. $C^\infty[a,b]$ denotes the set

$$C^\infty[a,b] = \bigcap_{n=0}^{\infty} C^n[a,b];$$

that is, $x \in C^\infty[a,b]$ if and only if x has continuous derivatives of all orders. Let
P(D) denote the differential operator

$$P(D) = \alpha_0 \frac{d^n}{dt^n} + \alpha_1 \frac{d^{n-1}}{dt^{n-1}} + \cdots + \alpha_{n-1} \frac{d}{dt} + \alpha_n,$$

where the $\alpha$'s are constants.

It follows that P(D) can be considered as a mapping of $C^\infty[a,b]$ into itself.
It can also be considered as a mapping of $C^n[a,b]$ into $C^0[a,b] = C[a,b]$.

The interval [a,b] in this example can be replaced by an arbitrary (even infinite)
interval. For that matter we could replace it by a set $\Omega$ in $R^m$. In the latter case,
$C^n(\Omega)$ would denote the collection of all functions $u = u(x_1, \ldots, x_m)$ defined on
$\Omega$ with the property that all partial derivatives up to order n are continuous. Also,
$C^\infty(\Omega) = \bigcap_{n=0}^{\infty} C^n(\Omega)$.

One sometimes refers to the fact that $u \in C^n(\Omega)$ by saying that u is a $C^n$-function
on $\Omega$ or that u is of class $C^n$. ∎


EXAMPLE 6. Sometimes functions are defined implicitly. For example let
k(t,s) be a continuous function defined for $0 \leq s \leq t \leq T$ and consider the Volterra
integral equation

$$y(t) = x(t) + \int_0^t k(t,s)\,y(s)\,ds.$$  (2.6.2)

It is shown in Exercise 3.15.7 that there is a function $F: C^0[0,\alpha] \to C^0[0,\alpha]$ where
$y = F(x)$ satisfies (2.6.2) provided $\alpha$ is a sufficiently small positive number. One
can also show that F is one-to-one; see Exercise 2, Section 4.4. ∎


Set Functions 


If f maps a set X into a set Y, then there is a mapping $\mathcal{F}$ naturally associated
with f which maps subsets of X into subsets of Y. Let P(X) denote the set of all
subsets of X. Then $\mathcal{F}: P(X) \to P(Y)$ is defined by

$$\mathcal{F}(A) = \{y \in Y : y = f(x) \text{ for some } x \text{ in } A\},$$
where $A \in P(X)$. Since $\mathcal{F}(A)$ is a subset of Y, it is an element of P(Y). For instance,
$\mathcal{F}(X) = \mathcal{R}(f)$, the range of f.
The mapping $\mathcal{F}$ maps P(X) onto P(Y) if and only if f maps X onto Y. $\mathcal{F}$
is one-to-one if and only if f is one-to-one. (Why?) Moreover,
$$\mathcal{F}(\emptyset) = \emptyset, \text{ the empty set,}$$
$$\mathcal{F}(A_1 \cup A_2) = \mathcal{F}(A_1) \cup \mathcal{F}(A_2),$$
$$\mathcal{F}(A_1 \cap A_2) \subset \mathcal{F}(A_1) \cap \mathcal{F}(A_2).$$
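The inclusion in the last line can be strict, and a small finite example shows why. In the Python sketch below the function f is a dictionary and `image` is a hypothetical helper playing the role of the set function; the particular f, $A_1$, and $A_2$ are illustrative assumptions chosen so that f is not one-to-one.

```python
def image(f, A):
    """The set of all f(x) with x in A, for A a subset of the domain of f."""
    return {f[x] for x in A}


f = {1: "a", 2: "a", 3: "b"}     # not one-to-one: f(1) = f(2)
A1, A2 = {1, 3}, {2, 3}

union_ok = image(f, A1 | A2) == image(f, A1) | image(f, A2)
lhs = image(f, A1 & A2)          # image of the intersection
rhs = image(f, A1) & image(f, A2)  # intersection of the images
print(union_ok, lhs, rhs, lhs < rhs)
```

Here 1 and 2 have the same image, so "a" belongs to both images although no single point of $A_1 \cap A_2$ maps to it.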
EXAMPLE 7. Let M be a mass initially at rest at a point A on a frictionless
plane as shown in Figure 2.6.3. Assume that a force x(t) is applied to M so that
it moves along a straight line in the plane. The distance along this line measured
from the point A is denoted y, and the velocity of M is denoted by $v = dy/dt = \dot{y}$. If
it is assumed that $C[0,\infty)$ is the set of allowable x's, then it is clear that

$$y(t) = \frac{1}{M} \int_0^t \int_0^\tau x(\theta)\,d\theta\,d\tau = \frac{1}{M} \int_0^t (t - \theta)\,x(\theta)\,d\theta,$$

where the last integral is obtained by a simple interchange of order of integration,
and

$$v(t) = \frac{1}{M} \int_0^t x(\theta)\,d\theta$$

define a mapping $f: C[0,\infty) \to C[0,\infty) \times C[0,\infty)$ where an element in the range is
the ordered pair of functions (y,v). That is, (y,v) = f(x) where y = displacement
and v = velocity.

Figure 2.6.3. A force x(t) applied to the mass M; at the rest position A, y = 0 and dy/dt = 0 at t = 0.

The mapping $\mathcal{F}$, then, is a mapping of subsets of $C[0,\infty)$ into subsets of
$C[0,\infty) \times C[0,\infty)$. For example, consider the set U of all inputs x(t) such that
$|x(t)| \leq u$, where u is a positive real number. It is not too difficult to show that
$\mathcal{F}(U)$ is characterized by the set of all (y,v) in $C[0,\infty) \times C[0,\infty)$ such that

(1) $y$, $\dot{y}$, $\ddot{y}$ are in $C[0,\infty)$,
(2) $v$ and $\dot{v}$ are in $C[0,\infty)$,
(3) $\dot{y} = v$,
(4) $|v(t)| \leq tu/M$,
(5) $|y(t)| \leq t^2 u/2M$. ∎


Now that the reader understands, we hope, the difference between $\mathcal{F}$ and f,
we will henceforth economize on symbols and dispense with the symbol $\mathcal{F}$. It
is customary to write f(A) instead of $\mathcal{F}(A)$.

In addition to the function $\mathcal{F}$, there is another mapping of subsets into subsets
which is naturally associated with f. It maps P(Y) into P(X). For the moment
we will denote it $\mathcal{F}^{-1}$. We define $\mathcal{F}^{-1}$ as follows:

$$\mathcal{F}^{-1}(C) = \{x \in X : f(x) \in C\},$$

where $C \in P(Y)$. That is, $\mathcal{F}^{-1}(C)$ consists of all pre-image points of points in C.
The mapping $\mathcal{F}^{-1}$ is referred to as the inverse set function or inverse image
mapping and the set $\mathcal{F}^{-1}(C)$ is said to be the inverse image of C. Note that

$$\mathcal{F}^{-1}(\emptyset) = \emptyset,$$
$$\mathcal{F}^{-1}(C_1 \cup C_2) = \mathcal{F}^{-1}(C_1) \cup \mathcal{F}^{-1}(C_2),$$
$$\mathcal{F}^{-1}(C_1 \cap C_2) = \mathcal{F}^{-1}(C_1) \cap \mathcal{F}^{-1}(C_2).$$
Also note that
$$\mathcal{F}^{-1}(\mathcal{R}(f) \cap C) = \mathcal{F}^{-1}(C)$$
for arbitrary C. Although the symbols might suggest it, $\mathcal{F}^{-1}$ is not necessarily the
inverse³ of $\mathcal{F}$. We note that $\mathcal{F}^{-1}$ always maps P(Y) into P(X). $\mathcal{F}^{-1}$ is one-to-one
and onto if and only if f is one-to-one and $\mathcal{R}(f) = Y$. Again to economize on symbols, we
will henceforth write $f^{-1}(C)$ instead of $\mathcal{F}^{-1}(C)$.
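The inverse image is defined for every f, whether or not f has an inverse, and a finite example makes the distinction visible. In the Python sketch below, `preimage` and `image` are hypothetical helper names, and the function f (a dictionary from X = {1, 2, 3} into Y = {"a", "b", "c"}) is an illustrative assumption chosen to be neither one-to-one nor onto.

```python
def image(f, A):
    """The set of all f(x) with x in A."""
    return {f[x] for x in A}


def preimage(f, C):
    """The set of all x in the domain of f with f(x) in C."""
    return {x for x in f if f[x] in C}


f = {1: "a", 2: "a", 3: "b"}      # Y = {"a", "b", "c"}; "c" is not in R(f)
C1, C2 = {"a", "c"}, {"b", "c"}

union_ok = preimage(f, C1 | C2) == preimage(f, C1) | preimage(f, C2)
inter_ok = preimage(f, C1 & C2) == preimage(f, C1) & preimage(f, C2)

# Two different subsets of Y with the same inverse image (f is not onto).
same_preimage = preimage(f, {"c"}) == preimage(f, set())

# Applying the image after the inverse image need not return C1.
round_trip = image(f, preimage(f, C1))
print(union_ok, inter_ok, same_preimage, round_trip)
```

Unlike the image, the inverse image preserves intersections exactly, which is what the equality in the third identity above asserts.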


EXAMPLE 8. Suppose that in Example 7 we let A be a set in $C[0,\infty) \times C[0,\infty)$
defined by

A = {set of all motions of the mass such that at time T > 0, y(T) = b, v(T) = 0}.

It then follows that

$$f^{-1}(A) = \Bigl\{x \in C[0,\infty) : \frac{1}{M} \int_0^T (T - \theta)\,x(\theta)\,d\theta = b \text{ and } \frac{1}{M} \int_0^T x(\theta)\,d\theta = 0\Bigr\}.$$ ∎

³ A term yet to be defined, but probably not new to the reader.


EXERCISES 


1. Consider the space C[0,T]. In electronics a clipper circuit is a circuit whose
output waveform is the input waveform (that is, voltage or current as a function
of time) with the “top and bottom clipped off.” More precisely, if x(t) is the
input waveform, then the output waveform z(t) is given by

$$z(t) = \begin{cases} A, & \text{if } x(t) > A \\ x(t), & \text{if } -A \leq x(t) \leq A \\ -A, & \text{if } x(t) < -A, \end{cases}$$  (2.6.3)

where A > 0 is sometimes referred to as the clipping level. A circuit which
would approximately realize this behavior is shown in Figure 2.6.4.

Figure 2.6.4. Clipper circuit with input x(t), output z(t), and reference levels +A volts and -A volts.


(a) Does the statement (2.6.3) describe a mapping of C[0,7] into C[0,7]? 
(b) If so, characterize the range of the mapping. 
(c) If so, is the mapping one-to-one? 


Figure 2.6.5. Digital device with input $(x_1, x_2, x_3)$ and a red light as output.

2. Let X be the set of all ordered triplets of 1's and 0's as shown in Figure 2.6.5.
Suppose we have a digital device into which we feed a point from X. The device
is so constructed that if x = (1,0,1) is the input, a red light lights; otherwise,
the light remains off. Characterize this situation as a mapping of the set X
into an appropriate set Y. Is this mapping one-to-one? Is this a mapping of X
onto Y?


[Figure 2.6.6. A digital device with input $\{x_1, x_2, x_3\}$ and output $\{y_1, y_2, y_3\}$, where $y_i = y_i(x_1, x_2, x_3)$, $1 \le i \le 3$.]


3. Let $X$ be the same as in Exercise 2, and let $Y = X$. Now suppose that we have
a proposed digital device which is to operate as follows (see Figure 2.6.6). At
noon of day number one, $x_1$ is put into the machine and $y_1$ comes out of the
machine almost instantaneously. At noon of the second day, $x_2$ goes in and $y_2$
comes out. Similarly, on the third day $x_3$ goes in and $y_3$ comes out. The
following table characterizes the desired operation.


Input  | Output
-------+-------
0,0,0  | 1,1,1
0,0,1  | 1,0,1
0,1,0  | 0,0,1
0,1,1  | 0,0,1
1,0,0  | 1,1,0
1,0,1  | 0,0,0
1,1,0  | 1,1,1
1,1,1  | 1,0,0


Have we characterized a mapping of $X$ into itself? Obviously, yes. Is this mapping
one-to-one? Is it onto? We will return to this exercise in Section 8.


7. INVERSES 


The next topic, inverses, is simple yet extremely important. Roughly speaking,
a mapping $F: X \to Y$ is invertible if there exists a mapping $G: Y \to X$ which
unravels $F$. Let us make this idea precise.


2.7.1 DEFINITION. A mapping $F: X \to Y$ is said to be invertible if there
exists a mapping $G: Y \to X$ such that $GF$ and $FG$ are the identity mappings on
the sets $X$ and $Y$, respectively. In this case $G$ is said to be an inverse of $F$.

The next few lemmas introduce some important facts. 


2.7.2 LEMMA. If a mapping $F: X \to Y$ is invertible, it has a unique inverse.


Proof: Assume that $G_1$ and $G_2$ are inverses of $F$, and let $y$ be any point in
$Y$. Since $FG_2$ and $G_1F$ represent the identity on $Y$ and $X$, respectively, one has


$$G_1(y) = G_1(FG_2(y)) = G_1F(G_2(y)) = G_2(y).$$

Hence $G_1 = G_2$. ∎


Because of this lemma, we shall denote the inverse of $F$ by $F^{-1}$.


2.7.3 LEMMA. If a mapping $F: X \to Y$ is invertible, then $F^{-1}$ is invertible and
$(F^{-1})^{-1} = F$.


The validity of this lemma is obvious. 


EXAMPLE 1. Let $X$ be the set of all real-valued $C^1$ functions $x$ defined on
the closed interval $0 \le t \le T$ such that $x(0) = 0$. Let $Y$ be the set of all real-valued
continuous functions $y$ defined on $0 \le t \le T$. If $F$ is defined by $y = dx/dt$, then


$$F^{-1}(y)(t) = \int_0^t y(\tau)\,d\tau. \qquad ∎$$


2.7.4 LEMMA. If a mapping $F: X \to Y$ is invertible, then $F$ is one-to-one.


Proof: Suppose $F$ is invertible. Recall that $F$ is one-to-one if every point
$y$ in $\mathcal{R}(F)$ has precisely one pre-image $x$ in $X$, that is, there is precisely one point $x$ in
$X$ such that $F(x) = y$. If $x_1$ and $x_2$ both satisfy $F(x_1) = F(x_2) = y$, then


$$x_1 = F^{-1}F(x_1) = F^{-1}F(x_2) = x_2.$$


Hence $F$ is one-to-one. ∎


2.7.5 LEMMA. If a mapping $F: X \to Y$ is invertible, then $\mathcal{R}(F) = Y$.


Proof: If $y \in Y$, then $y = FF^{-1}(y) = F(x)$, where $x = F^{-1}(y)$. Hence $y \in \mathcal{R}(F)$,
or $\mathcal{R}(F) = Y$. ∎


Building upon Lemmas 2.7.4 and 2.7.5, we arrive at the following elegant 
characterization of invertible mappings. 


2.7.6 THEOREM. A mapping $F: X \to Y$ is invertible if and only if it is one-to-one
and $\mathcal{R}(F) = Y$.


Proof: Lemmas 2.7.4 and 2.7.5 furnish the proof for the "only if" part of
the theorem. We need to prove the "if" part. We assume that $F$ is one-to-one
and $\mathcal{R}(F) = Y$ and show it follows that $F$ is invertible. We do so by exhibiting the
inverse $F^{-1}$. Let $y$ be an arbitrary element of $Y$. Since $\mathcal{R}(F) = Y$ and $F$ is one-to-one,
$y$ has precisely one pre-image $x$ in $X$. We define $G(y)$ to be $x$. We then have a
mapping $G: Y \to X$ and it is clear that $FG$ and $GF$ are the identity mappings on
$Y$ and $X$, respectively. Hence $F$ is invertible. ∎
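
For finite sets the construction in this proof can be carried out directly. The sketch below assumes a mapping stored as a Python dictionary; the function name `inverse_of` and the sample sets are hypothetical and serve only as illustration.

```python
def inverse_of(F, X, Y):
    """Return G: Y -> X as a dict if F (a dict on X) is one-to-one with range Y, else None."""
    G = {}
    for x in X:
        y = F[x]
        if y in G:            # two points share an image: F is not one-to-one
            return None
        G[y] = x              # define G(y) to be the unique pre-image of y
    if set(G) != set(Y):      # some y in Y has no pre-image: the range is not Y
        return None
    return G

X = [0, 1, 2]
Y = ["a", "b", "c"]
F = {0: "b", 1: "c", 2: "a"}

G = inverse_of(F, X, Y)
gf_is_identity = all(G[F[x]] == x for x in X)
fg_is_identity = all(F[G[y]] == y for y in Y)

not_onto = inverse_of(F, X, Y + ["d"])                      # range is a proper subset
not_injective = inverse_of({0: "a", 1: "a", 2: "b"}, X, Y)  # two points map to "a"
```

The two failing calls correspond to the two ways the hypothesis of the theorem can fail.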


An invertible mapping of $X$ onto $Y$ is sometimes called a one-to-one correspondence
between $X$ and $Y$, for it puts the elements of $X$ and $Y$ into a one-to-one
correspondence.

Before the reader becomes too enamoured with Theorem 2.7.6, it should be 
noted that in practice determining whether or not a given mapping is one-to-one 
or onto is often a difficult task. 

Suppose $F: X \to Y$ is one-to-one but $\mathcal{R}(F) \ne Y$. Then $F$ is not invertible.
However, if we replace $Y$ with $\mathcal{R}(F)$, then $F: X \to \mathcal{R}(F)$ is one-to-one and a mapping
of $X$ onto $\mathcal{R}(F)$. In this case $F$ is invertible, and this inverse is also denoted by
$F^{-1}$. However, it should be emphasized that the domain of this inverse is $\mathcal{R}(F)$
and not $Y$.

In the last section we introduced a simplified notation for the inverse set function
associated with a mapping $F: X \to Y$. Recall that this was denoted by $F^{-1}$.
Needless to say, this is apt to be confused with the inverse as defined in this section.
However, these notational conventions are so standard it does not pay to tamper
with them here. Furthermore, the usage will always be clear from context. Two
points should be emphasized: (1) the inverse set function is defined for every
mapping $F: X \to Y$, (2) the inverse is defined only for certain mappings $F: X \to Y$.


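EXAMPLE 2. Suppose that $X = Y = l_2$ is the set* of all infinite sequences
$x = \{x_1, x_2, \ldots\}$ of real (or complex) numbers such that


$$\sum_{k=1}^{\infty} |x_k|^2 < \infty.$$


Consider the sequence $y = \{y_1, y_2, \ldots\}$, where $y_k = \lambda_k x_k$ and $|\lambda_k| \le M$, a constant,
for $k = 1, 2, \ldots$. The following infinite matrix equation is a convenient way to
represent the dependence of $y$ on $x$.


$$\begin{bmatrix} y_1 \\ y_2 \\ \vdots \end{bmatrix} = \begin{bmatrix} \lambda_1 & & 0 \\ & \lambda_2 & \\ 0 & & \ddots \end{bmatrix} \begin{bmatrix} x_1 \\ x_2 \\ \vdots \end{bmatrix}$$


If the sequence $y$ is in $l_2$ for arbitrary $x$ in $l_2$, then a mapping of $l_2$ into itself has
been described. But


$$\sum_{k=1}^{N} |y_k|^2 \le M^2 \sum_{k=1}^{N} |x_k|^2 \le M^2 \sum_{k=1}^{\infty} |x_k|^2.$$

Hence, $y$ is in $l_2$. Denote this mapping of $l_2$ into itself by $A$.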


* We will occasionally denote this set by $l_2(0,\infty)$. The symbol $l_2(-\infty,\infty)$ will denote the set of
doubly infinite sequences $x = (\ldots, x_{-1}, x_0, x_1, x_2, \ldots)$ such that $\sum_{n=-\infty}^{\infty} |x_n|^2 < \infty$; see Example 2.8.1.




It is easily shown that $A$ is one-to-one if and only if $\lambda_k \ne 0$ for all $k$. Indeed, we
want to show that $A(x) \ne A(z)$ whenever $x \ne z$. That is, the sequence $\{\lambda_k(x_k - z_k)\}$
is not a sequence of zeros whenever $x_j \ne z_j$ for at least one index $j$. The latter
statement is now obviously the case if and only if $\lambda_k \ne 0$, $k = 1, 2, \ldots$.

If $|\lambda_k| \ge m > 0$ for all $k$, then $\mathcal{R}(A) = l_2$. Let $y = \{y_1, y_2, \ldots\}$ be an arbitrary
sequence in $l_2$, and let $x = \{\lambda_1^{-1}y_1, \lambda_2^{-1}y_2, \ldots\}$. If $x$ is in $l_2$, then it is clearly a
pre-image of $y$ and hence $\mathcal{R}(A) = l_2$. However,


$$\sum_{k=1}^{N} |x_k|^2 = \sum_{k=1}^{N} |\lambda_k^{-1} y_k|^2 \le m^{-2} \sum_{k=1}^{N} |y_k|^2 \le m^{-2} \sum_{k=1}^{\infty} |y_k|^2,$$


which shows that $x$ is in $l_2$.
On the other hand, we claim that if $|\lambda_k|$ is not bounded away from 0, then


$$\mathcal{R}(A) \ne l_2. \tag{2.7.1}$$


To prove this we first note that (2.7.1) is true if one or more $\lambda_k$'s are zero. Thus we
assume that $\lambda_k \ne 0$ for all $k$ and $\lambda_{k_i} \to 0$ for some subsequence $k_i$. By taking a
further subsequence we can choose the $k_i$ so that $|\lambda_{k_i}| \le 1/i$. We now claim that
the sequence $y \in l_2$ defined by


$$y_k = \begin{cases} 1/i, & \text{if } k = k_i \\ 0, & \text{otherwise} \end{cases}$$


is not in $\mathcal{R}(A)$. Indeed, if $y \in \mathcal{R}(A)$, then its pre-image would have to be $x$ where
$|x_{k_i}| = |\lambda_{k_i}^{-1} y_{k_i}| \ge 1$. Since there are an infinite number of such $k_i$, one has


$$\sum_{k=1}^{\infty} |x_k|^2 = \infty.$$


Hence $x \notin l_2$, and (2.7.1) is valid.
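
A numerical sketch may make the failure of the range condition concrete. It assumes the particular multipliers $\lambda_k = 1/k$ and the target $y_k = 1/k$, which are illustrative choices rather than the subsequence construction used above, and it only inspects partial sums.

```python
def partial_sum_sq(seq, N):
    """Sum of |seq(k)|^2 for k = 1, ..., N."""
    return sum(abs(seq(k)) ** 2 for k in range(1, N + 1))

lam = lambda k: 1.0 / k        # assumed multipliers; not bounded away from 0
y = lambda k: 1.0 / k          # square-summable target sequence
x = lambda k: y(k) / lam(k)    # the only candidate pre-image; here x_k = 1

for N in (10, 100, 1000):
    # partial sums for y stay bounded, those for x grow like N
    print(N, partial_sum_sq(y, N), partial_sum_sq(x, N))
```

The partial sums for `y` remain below $\pi^2/6$, while those for `x` grow without bound, so under these assumptions `y` has no pre-image in $l_2$.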

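We see, then, that $A: l_2 \to l_2$ can be one-to-one with range $\mathcal{R}(A) \ne l_2$. In that
case, $A$ is not invertible. However, if we restrict our attention to the range of $A$,
the following matrix equation represents $A^{-1}$:


$$\begin{bmatrix} x_1 \\ x_2 \\ \vdots \end{bmatrix} = \begin{bmatrix} \lambda_1^{-1} & & 0 \\ & \lambda_2^{-1} & \\ 0 & & \ddots \end{bmatrix} \begin{bmatrix} y_1 \\ y_2 \\ \vdots \end{bmatrix}$$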


We now turn to the concepts of left and right inverses. 


2.7.7 DEFINITION. A mapping $F: X \to Y$ is said to be left invertible if a
mapping $G: Y \to X$ exists such that $GF$ is the identity mapping on the set $X$.


2.7.8 THEOREM. A mapping $F: X \to Y$ is left invertible if and only if it is
one-to-one.




Proof: Consider the "if" first. Assume that $F: X \to Y$ is one-to-one. Then
$F: X \to \mathcal{R}(F)$ is invertible, with $F^{-1}: \mathcal{R}(F) \to X$. Let $G: Y \to X$ be any extension
of $F^{-1}$. It follows that $GF$ is the identity on $X$.
The converse, or the "only if" part, is proved in the same manner that we
proved Lemma 2.7.4. We ask the reader to check the details. ∎


If $F: X \to Y$ is left invertible, the mapping $G$ given in Definition 2.7.7 is said
to be a left inverse of $F$. The notation we shall use for this is $G = F_l^{-1}$. It should
be clear from the proof that if $\mathcal{R}(F) \ne Y$, then $G$ is not unique. That is, $F$ has many
left inverses. However, the restriction of any left inverse $G$ to $\mathcal{R}(F)$ must be
$F^{-1}: \mathcal{R}(F) \to X$, that is, $G|_{\mathcal{R}(F)} = F^{-1}$. This restriction is unique by Lemma 2.7.2.
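
The non-uniqueness of left inverses is easy to see on finite sets. In the sketch below the helper `left_inverse` and the sample data are hypothetical; the helper extends $F^{-1}$ from the range of $F$ to all of $Y$ by sending every remaining point to a chosen default value.

```python
def left_inverse(F, X, Y, default):
    """Extend F^{-1} from the range of F to all of Y; F is assumed one-to-one."""
    G = {F[x]: x for x in X}          # F^{-1} on the range of F
    for y in Y:
        G.setdefault(y, default)      # arbitrary value off the range
    return G

X = [0, 1]
Y = ["a", "b", "c"]
F = {0: "a", 1: "c"}                  # one-to-one, but "b" is not in the range

G1 = left_inverse(F, X, Y, default=0)
G2 = left_inverse(F, X, Y, default=1)

both_work = all(G1[F[x]] == x and G2[F[x]] == x for x in X)
agree_on_range = all(G1[F[x]] == G2[F[x]] for x in X)
```

`G1` and `G2` differ only at `"b"`, the point outside the range, which is exactly the freedom described in the paragraph above.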


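EXAMPLE 3. Returning to Example 2, and assuming $A$ is one-to-one, one left inverse of $A$ is as follows:
Given $y \in l_2$,


$$A_l^{-1}(y) = \begin{cases} \begin{bmatrix} \lambda_1^{-1} & & 0 \\ & \lambda_2^{-1} & \\ 0 & & \ddots \end{bmatrix} \begin{bmatrix} y_1 \\ y_2 \\ \vdots \end{bmatrix}, & \text{if } \sum_{k=1}^{\infty} |\lambda_k^{-1} y_k|^2 < \infty, \\ 0, & \text{otherwise.} \end{cases}$$


What are some other left inverses for $A$? ∎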


2.7.9 DEFINITION. A mapping $F: X \to Y$ is said to be right invertible if a
mapping $H: Y \to X$ exists such that $FH$ is the identity mapping on the set $Y$. The
mapping $H$ is called a right inverse of $F$ and denoted $F_r^{-1}$.


2.7.10 THEOREM. A mapping $F: X \to Y$ is right invertible if and only if
$\mathcal{R}(F) = Y$.


Proof: Let us consider the "if" part of the theorem. Assume that $\mathcal{R}(F) = Y$
and let $y_0$ be an arbitrary element of $Y$. Since $y_0 \in \mathcal{R}(F)$, $y_0$ has at least one pre-image
in $X$, that is, the set $F^{-1}(\{y_0\})$ is not empty. Moreover, if $y_1 \ne y_2$, the sets
$F^{-1}(\{y_1\})$ and $F^{-1}(\{y_2\})$ in $X$ are disjoint. We define a mapping $H: Y \to X$ by
selecting an arbitrary representative (call it $x$) from each set $F^{-1}(\{y\})$, and letting
$x = H(y)$. It is clear that $FH$ is the identity on $Y$.

The converse, or the "only if" part, is proved in exactly the same manner that
we proved Lemma 2.7.5. We ask the reader to check the details. ∎
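
The selection of representatives in this proof can be sketched for finite sets. The function `right_inverse`, its `choose` parameter, and the parity example are assumptions made for illustration only.

```python
def right_inverse(F, X, Y, choose=min):
    """Pick one representative from each inverse image F^{-1}({y}); None if F is not onto."""
    H = {}
    for y in Y:
        fiber = [x for x in X if F[x] == y]   # the inverse image of {y}
        if not fiber:                         # y is not in the range of F
            return None
        H[y] = choose(fiber)
    return H

X = [0, 1, 2, 3]
Y = ["even", "odd"]
F = {x: "even" if x % 2 == 0 else "odd" for x in X}   # onto Y, not one-to-one

H_min = right_inverse(F, X, Y, choose=min)
H_max = right_inverse(F, X, Y, choose=max)

fh_is_identity = all(F[H_min[y]] == y and F[H_max[y]] == y for y in Y)
```

Different selection rules give different right inverses, which is possible here because `F` is not one-to-one.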


It should be noted that if $\mathcal{R}(F) = Y$ and if $F$ is not one-to-one, then $F$ will
have more than one right inverse.




The results of this section are summarized in Figure 2.7.1. 


[Figure 2.7.1. Summary for a given mapping $F: X \to \mathcal{R}(F) \subset Y$.]

Is F one-to-one?
  Yes: F has a left inverse, and F has an inverse defined on its range.
    Is F onto, that is, R(F) = Y?
      Yes: F has an inverse.
      No: F has only left inverses.
  No:
    Is F onto, that is, R(F) = Y?
      Yes: F has only right inverses.
      No: F has no inverses of any kind.


EXAMPLE 4. Let $X = R^4$ and $Y = R^2$ be the sets of all ordered quadruples
and all ordered pairs of real numbers, respectively. The following matrix equation
represents a mapping of $X$ into $Y$.


one Ces Gs ne 
yo} jl -1 -1 l}] x3 


The range of the mapping is Y, and therefore, it has a right inverse. One right 
inverse can be represented as follows. 


x4 4 4 
a {|e [bil 
X4 4 4 


Are there any other right inverses? ∎


EXAMPLE 5. Let $X$ and $Y$ be intervals in the real line $R$. A function $f: X \to Y$
is said to be monotone if

(1) $s < t \Rightarrow f(s) \le f(t)$, or

(2) $s < t \Rightarrow f(s) \ge f(t)$.

If (1) holds one says that $f$ is monotone increasing. If (2) holds $f$ is monotone decreasing.
If one has

(3) $s < t \Rightarrow f(s) < f(t)$, or

(4) $s < t \Rightarrow f(s) > f(t)$,

then $f$ is said to be strictly increasing, or strictly decreasing, respectively. It is easy
to see that if $f$ is strictly increasing, or strictly decreasing, then $f$ is one-to-one.
When this happens, $f^{-1}: \mathcal{R}(f) \to X$ is defined and moreover $f^{-1}$ is also strictly
monotone. ∎


EXERCISES 


1. Suppose⁵ we have a "black box," $T$, as shown in Figure 2.7.2, whose input and
output are the voltage waveforms $m(t)$ and $c(t)$, respectively. Assume that a
mathematical model for this black box is the mapping of $L_2(-\infty,\infty)$ into
itself represented by

$$c(t) = \int_{-\infty}^{t} e^{-(t-\tau)} m(\tau)\,d\tau.$$


Figure 2.7.2.


Further, suppose we have another "black box," $K$, for which a suitable mathematical
model is the mapping of $L_2(-\infty,\infty)$ into itself represented by


$$m(t) = k s(t), \quad \text{for } -\infty < t < \infty,$$


where $s$ and $m$ are the input and output of $K$, respectively, and $k$ is a real number.

Suppose we have $K$ and $T$ interconnected in a closed loop system as shown
in Figure 2.7.3. One can view $r$ and $c$ as the system input and output, respectively.
The function $s(t)$ is $r(t) - c(t)$ and is sometimes referred to as the error.
It follows from the above diagram that given an error $s \in L_2(-\infty,\infty)$, the
corresponding system input $r \in L_2(-\infty,\infty)$ is given by


$$r(t) = s(t) + k \int_{-\infty}^{t} e^{-(t-\tau)} s(\tau)\,d\tau.$$


Denote this mapping of $L_2(-\infty,\infty)$ into itself by $F$; that is, $r = F(s)$.


⁵ The next four exercises have important applications in stability theory; see Damborg and Naylor [1].


Figure 2.7.3. 


Usually we are interested in determining how $s$ depends on $r$; that is, we
are interested in $F^{-1}$, if it exists.

Show by simple substitution that, for $1 + k > 0$, $F^{-1}$ exists and can be
represented⁶ by


$$s(t) = r(t) - k \int_{-\infty}^{t} e^{-(1+k)(t-\tau)} r(\tau)\,d\tau.$$


Show again by substitution that for $1 + k < 0$, $F^{-1}$ exists and can be
represented by


$$s(t) = r(t) + k \int_{t}^{\infty} e^{-(1+k)(t-\tau)} r(\tau)\,d\tau.$$


2. (Continuation of Exercise 1.) Now consider the case $k = -1$; that is, $F$ is
represented by


$$r(t) = s(t) - \int_{-\infty}^{t} e^{-(t-\tau)} s(\tau)\,d\tau.$$


Show that $\mathcal{R}(F)$ is not $L_2(-\infty,\infty)$ by showing that if $r$ is in the range of $F$, then


$$\lim_{M,N \to \infty} \int_{-M}^{N} r(t)\,dt = 0.$$


Since $\mathcal{R}(F)$ is not $L_2(-\infty,\infty)$, the mapping $F$ cannot be invertible. However,
if $F$ is one-to-one, it will have a left inverse. Show that


$$s(t) = r(t) + \int_{-\infty}^{t} r(\tau)\,d\tau$$


represents a left inverse of $F$.
It happens, perhaps surprisingly, that


$$s(t) = r(t) - \int_{t}^{\infty} r(\tau)\,d\tau$$

also represents a left inverse $F_l^{-1}$. As a matter of fact,

$$s(t) = r(t) + \alpha \int_{-\infty}^{t} r(\tau)\,d\tau - \beta \int_{t}^{\infty} r(\tau)\,d\tau,$$

where $\alpha$ and $\beta$ are any real numbers such that $\alpha + \beta = 1$, also represents $F_l^{-1}$.
Show that this is indeed the case.
What can be said on the subject of a right inverse?


⁶ Assume for the present that any required interchange of order of integration is allowable.


3. Given the "black boxes" in Exercise 1, we can usually model them using sets
of functions other than $L_2(-\infty,\infty)$. For example, we can often model them
using $L_2^{\sigma}(-\infty,\infty)$, where $\sigma$ is a real number and $L_2^{\sigma}(-\infty,\infty)$ is the set of all
real-valued functions $x$ defined on $-\infty < t < \infty$ such that


$$\int_{-\infty}^{\infty} |x(t)|^2 e^{-2\sigma t}\,dt < \infty.$$


Assume that a mathematical model for $T$ is given by the mapping of
$L_2^{\sigma}(-\infty,\infty)$ into itself for $\sigma > -1$ represented by


$$c(t) = \int_{-\infty}^{t} e^{-(t-\tau)} m(\tau)\,d\tau.$$


Note that this mathematical model is the same as that used in Exercise 1 with
$\sigma = 0$.

Similarly, assume that a mathematical model for $K$ is the mapping of
$L_2^{\sigma}(-\infty,\infty)$ into itself represented by


$$m(t) = k s(t).$$


Further, assume that $K$ and $T$ are interconnected as in Exercise 1. It then
follows that a mathematical model characterizing the dependence of $r$ on $s$ is
given by


$$r(t) = s(t) + k \int_{-\infty}^{t} e^{-(t-\tau)} s(\tau)\,d\tau,$$


where the above represents a mapping $F$ of $L_2^{\sigma}(-\infty,\infty)$ into itself for $\sigma > -1$.
As before, we are interested in the inverse of $F$, if it exists. We now have
two parameters to consider: $k$ and $\sigma$.


Case 1. $\sigma > -(1 + k)$ and $\sigma > -1$.

Show that $F^{-1}$ can be represented as follows:

$$s(t) = r(t) - k \int_{-\infty}^{t} e^{-(1+k)(t-\tau)} r(\tau)\,d\tau.$$

Case 2. $-1 < \sigma < -(1 + k)$.

Show that $F^{-1}$ can be represented as follows:

$$s(t) = r(t) + k \int_{t}^{\infty} e^{-(1+k)(t-\tau)} r(\tau)\,d\tau.$$


4. Using Exercise 2 as a model, discuss the case $\sigma = -(1 + k)$ in Exercise 3.
5. Reconsider Exercises 1 and 2 on the set $L_2[0,\infty)$.




6. Show that if a mapping $T: X \to Y$ is both left and right invertible, then (i) $T$ is
invertible, (ii) the left and right inverses are unique, and that all three inverses
$T^{-1}$, $T_l^{-1}$, $T_r^{-1}$ are the same.

7. Suppose that $F: X \to Y$ and $G: Y \to Z$ are invertible mappings. Show that the
composition $GF$ is invertible and that $(GF)^{-1} = F^{-1}G^{-1}$.


8. SYSTEM TYPES 


There are two concepts from system theory that are used in later examples, 
and it is convenient to introduce them here. 

The first system concept is that of time-invariance. We shall discuss this concept
first for discrete-time (sampled data) systems and then for continuous-time systems.
In either event, the basic idea is the same. A system is time-invariant if the only
effect of a translation in time of an input is a corresponding translation in time of
the output. That is, if $y(t)$ is the output associated with an input $x(t)$, then $y(t + \tau)$,
where $\tau$ is a constant, is the output associated with the input $x(t + \tau)$. Let us make
this precise. First we consider the discrete-time case.

Let $X$ denote a set of doubly infinite sequences $x = \{\ldots, x_{-1}, x_0, x_1, \ldots\}$.
The only requirement we place on $X$ is that if $x = \{x_n\}$ is in $X$, then so is every
translate of $x$. That is, if $\{x_n\} \in X$ and $N$ is an integer (positive or negative), then
$\{y_n\} \in X$, where $y_n = x_{n+N}$, $n = \ldots, -1, 0, 1, \ldots$. Briefly, $X$ is closed under
translations.

The sequences in $X$ can be viewed as inputs and outputs to discrete-time
systems, and time can be assumed to increase with the index; that is, $x_1$ comes
before $x_2$.

Let $S_r: X \to X$ be the right shift (or translation) defined on $X$. In particular,
if $x = \{\ldots, x_{-1}, x_0, x_1, \ldots\}$ and $S_r x = y = \{\ldots, y_{-1}, y_0, y_1, \ldots\}$, then $y_k = x_{k-1}$ for
$k = \ldots, -1, 0, 1, \ldots$. The composition


$$\underbrace{S_r S_r \cdots S_r}_{n \text{ times}}$$


is denoted $S_r^n$. The shift $S_r$ is clearly invertible, so $S_r^n$ is meaningful when $n$ is a
negative integer. Moreover, $S_r^0 = I$, the identity.
We are now ready to define time-invariance on $X$.
We are now ready to define time-invariance on X. 


2.8.1 DEFINITION. A mapping $T$ of $X$ into itself is said to be time-invariant if

$$S_r^n T = T S_r^n \tag{2.8.1}$$

for all integers $n$.


If $x$ is an input and $y = Tx$ is the associated output, $S_r^n x$ is the input $x$ shifted
in time and $T S_r^n x$ is the output associated with this shifted version of $x$. $S_r^n T x$ is
a shifted version of $y$. Equation (2.8.1) states that this shifted version of $y$ equals
the output associated with the shifted input.


2.8.2 THEOREM. A mapping $T: X \to X$ is time-invariant if and only if

$$S_r T = T S_r.$$


Proof: If $T$ is time-invariant, we clearly have $S_r T = T S_r$. Going the other
way, if $S_r T = T S_r$, then

$$S_r^2 T = S_r(S_r T) = S_r(T S_r) = (S_r T) S_r = (T S_r) S_r = T S_r^2$$

and so forth. ∎


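EXAMPLE 1. Let $X = l_2(-\infty,\infty)$, the set of all doubly infinite sequences
$x = \{\ldots, x_i, x_{i+1}, \ldots, x_{j-1}, x_j, \ldots\}$ such that

$$\sum_{i=-\infty}^{\infty} |x_i|^2 < \infty.$$

Let $T$ denote a mapping of $X$ into itself that can be represented by


$$(Tx)_i = \sum_{j=-\infty}^{\infty} h_{(i-j)} x_j, \quad i = \ldots, -1, 0, 1, \ldots,$$


where

$$Tx = \{\ldots, (Tx)_i, (Tx)_{i+1}, \ldots, (Tx)_{j-1}, (Tx)_j, \ldots\}$$

and $h$ is a sequence

$$h = \{\ldots, h_{-2}, h_{-1}, h_0, h_1, h_2, h_3, \ldots\}.$$

Then

$$(T S_r x)_i = \sum_{j=-\infty}^{\infty} h_{(i-j)} x_{j-1}$$

and

$$(S_r T x)_i = \sum_{j=-\infty}^{\infty} h_{(i-1-j)} x_j.$$


Letting $j = l - 1$ in the last equation shows that $T S_r = S_r T$. So $T$ is time-invariant. ∎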
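
The commutation of a convolution with the shift can be checked on finitely supported sequences. The sketch below stores a sequence as a dictionary from index to value; the kernel `h`, the input `x`, and the helper names are hypothetical, and the index-weighted map `ramp` is included only as a contrasting time-varying example.

```python
def shift(x, n=1):
    """Right shift by n places: (S^n x)_k = x_{k-n}."""
    return {k + n: v for k, v in x.items()}

def convolve(h, x):
    """(Tx)_i = sum_j h_{i-j} x_j for finitely supported h and x."""
    out = {}
    for m, hv in h.items():
        for j, xv in x.items():
            out[m + j] = out.get(m + j, 0) + hv * xv
    return out

def ramp(x):
    """A time-varying map: multiply each entry by its index."""
    return {k: k * v for k, v in x.items()}

h = {-1: 2, 0: 1, 3: -1}     # hypothetical kernel
x = {-2: 1, 0: 4, 5: -3}     # hypothetical input

commutes = convolve(h, shift(x)) == shift(convolve(h, x))
ramp_commutes = ramp(shift(x)) == shift(ramp(x))
```

With integer data the comparison is exact: `commutes` holds for the convolution, while `ramp_commutes` fails, so `ramp` is not time-invariant.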


Next we define time-invariance for the continuous time case. 


Suppose $Z$ is a set of functions defined on the real line, $(-\infty,\infty)$. Similarly to
the discrete-time case we require that if $x \in Z$ and $y(t) = x(t + \tau)$, where $\tau$ is a
constant, then $y \in Z$. The shift by $\tau$, denoted $S_\tau$, is the mapping of $Z$ into itself
defined by

$$(S_\tau x)(t) = x(t + \tau), \quad t \in (-\infty,\infty).$$


2.8.3 DEFINITION. A mapping $T$ of $Z$ into itself is said to be time-invariant if


$$S_\tau T = T S_\tau$$


for all $-\infty < \tau < \infty$.




EXAMPLE 2. Consider $Z = L_2(-\infty,\infty)$ and let $T$ be a mapping of $Z$ into
itself represented by a convolution integral of the form


$$(Tx)(t) = \int_{-\infty}^{\infty} h(t - s)\,x(s)\,ds.$$


Then

$$(T S_\tau x)(t) = \int_{-\infty}^{\infty} h(t - s)\,x(s + \tau)\,ds$$

and

$$(S_\tau T x)(t) = \int_{-\infty}^{\infty} h(t + \tau - s)\,x(s)\,ds.$$


Letting $s = \theta + \tau$ in the last equation shows that $S_\tau T = T S_\tau$ for arbitrary $\tau$.
Hence, $T$ is time-invariant. ∎


A mapping 7 that is not time-invariant is said to be time-varying. 

Causality is the next systems theory concept we consider. Roughly speaking, 
a system is causal if past output is independent of future inputs. 

Let Y be a set of doubly infinite sequences similar to the set X described above. 
We no longer require, however, that Y be closed under translations or shifts. 


2.8.4 DEFINITION. A mapping $T$ of $Y$ into itself is said to be causal if for
each integer $N$ whenever two inputs $x = \{x_n\}$ and $y = \{y_n\}$ are such that $x_n = y_n$
for $n \le N$, it follows that $(Tx)_n = (Ty)_n$ for $n \le N$, where


$$Tx = \{\ldots, (Tx)_{-1}, (Tx)_0, (Tx)_1, (Tx)_2, \ldots\}$$

and

$$Ty = \{\ldots, (Ty)_{-1}, (Ty)_0, (Ty)_1, (Ty)_2, \ldots\}.$$


In other words, if the inputs $x$ and $y$ agree up to some time $N$, then the outputs
$Tx$ and $Ty$ agree up to time $N$. In particular, $Tx$ and $Ty$ agree up to time $N$ no
matter what the inputs $x$ and $y$ are in the future beyond $N$.


EXAMPLE 3. It is a simple matter to show that in Example 1 the operator
$T$ represents a causal system if and only if $h_{(k-j)} = 0$ for $j > k$. Indeed, in that case
we have the Volterra equation


$$(Tx)_k = \sum_{j=-\infty}^{k} h_{(k-j)} x_j,$$

showing that $(Tx)_k$ is independent of $x_{k+1}$, $x_{k+2}$, and so on. ∎
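
The causality condition can be tested numerically on two inputs that agree up to a time `N` and differ afterwards. Everything in this sketch (the kernels, the inputs, the window of indices examined) is an illustrative assumption; a finite check of this kind is evidence, not a proof.

```python
def apply_kernel(h, x, k):
    """(Tx)_k = sum_j h(k - j) x_j over the finite support of x."""
    return sum(h(k - j) * xj for j, xj in x.items())

causal_h = lambda m: 0.5 ** m if m >= 0 else 0.0   # h_m = 0 for m < 0
noncausal_h = lambda m: 0.5 ** abs(m)              # h_m != 0 for m < 0

N = 2
x = {0: 1.0, 1: 2.0, 2: -1.0, 3: 5.0}
y = {0: 1.0, 1: 2.0, 2: -1.0, 3: -7.0, 4: 3.0}     # agrees with x for n <= N

def outputs_agree_up_to(h, N, lo=-5):
    """Compare (Tx)_k and (Ty)_k for lo <= k <= N."""
    return all(apply_kernel(h, x, k) == apply_kernel(h, y, k)
               for k in range(lo, N + 1))

causal_ok = outputs_agree_up_to(causal_h, N)
noncausal_ok = outputs_agree_up_to(noncausal_h, N)
```

With the one-sided kernel the outputs agree up to time `N`; with the two-sided kernel they do not, because future input values leak into earlier outputs.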


Causality for continuous-time systems is defined in a similar manner. Let U 
be a set of functions defined on the real line. 


2.8.5 DEFINITION. A mapping $F$ of $U$ into itself is said to be causal if for
each time $T$, one has $(Fx)(t) = (Fy)(t)$ for $t \le T$, whenever $x(t) = y(t)$ for $t \le T$.


EXAMPLE 4. In the case of Example 2, the operator $T$ represents a causal
system if and only if $h(t - \tau) = 0$ for (almost) all $(t - \tau) < 0$. We can write $T$ as a
Volterra integral


$$(Tx)(t) = \int_{-\infty}^{t} h(t - s)\,x(s)\,ds,$$

showing that $(Tx)(t)$ is independent of $x(s)$ for $s > t$. ∎


There is a great deal more that can be said about the concepts of time-invariance
and causality. Indeed, examples which appear later will do just that.
However, for the moment we will let the matter rest.


EXERCISES


1. Is the digital system described in Exercise 3 of Section 6 causal?


2. A system is sometimes said to be anticausal if future output is independent of
past input. For example, let $F: L_2(-\infty,\infty) \to L_2(-\infty,\infty)$. Then $F$ is anticausal
if for each time $T$ one has $(Fx)(t) = (Fy)(t)$ for $t \ge T$ whenever $x(t) = y(t)$
for $t \ge T$. The definition is analogous for sampled data (or discrete-time)
systems. Which of the following integral operators represents an anticausal
mapping of $L_2(-\infty,\infty)$ into itself:


(a) y(t) =f “9 x(t) dr. 


(b) y(t) = { e © O"'x(a) de. 


(c) y(t)= { e~ —)x(z) dt. 


3. Consider the collection $X$ of all mappings of $L_2(-\infty,\infty)$ into itself. Let $\mathcal{C} \subset X$
denote the causal mappings, and let $\mathcal{A} \subset X$ denote the anticausal mappings.
Show that $\mathcal{C} \cap \mathcal{A}$ is not empty. Mappings in this intersection are referred to
as memoryless.


4. Suppose that $Y$ is a set of doubly infinite sequences $y = \{\ldots, y_{-1}, y_0, y_1, y_2, \ldots\}$
and that $T$ is an invertible mapping of $Y$ onto itself. Show that the inverse of $T$
is causal if and only if $Tx = \{\ldots, (Tx)_{-1}, (Tx)_0, (Tx)_1, \ldots\}$ and $Ty = \{\ldots, (Ty)_{-1},
(Ty)_0, (Ty)_1, \ldots\}$ are such that if $(Tx)_n = (Ty)_n$ for $n \le N$ then it follows that
$x_n = y_n$ for $n \le N$, where $x = \{\ldots, x_{-1}, x_0, x_1, x_2, \ldots\}$ and $y = \{\ldots, y_{-1}, y_0, y_1,
y_2, \ldots\}$. Carefully note that the causality of $T^{-1}$ is independent of the causality
of $T$.


5. Which of the mappings in Exercises 1, 2, 3, and 4 of Section 7 are causal?




SUGGESTED REFERENCES 


Halmos [5] 
Hausdorff [1] 
Wilder [1] 


Topological Structure

1. Introduction

Part A  Introduction to Metric Spaces
2. Metric Spaces: Definitions
3. Examples of Metric Spaces
4. Subspaces and Product Spaces
5. Continuous Functions
6. Convergent Sequences
7. A Connection between Continuity and Convergence

Part B  Some Deeper Metric Space Concepts
8. Local Neighborhoods
9. Open Sets
10. More on Open Sets
11. Examples of Homeomorphic Metric Spaces
12. Closed Sets and the Closure Operation
13. Completeness
14. Completion of Metric Spaces
15. Contraction Mappings
16. Total Boundedness
17. Compactness


1. INTRODUCTION 


The word "topology" means literally "the study of places" or "the study
of localities." This subject is a natural outgrowth of Euclid's study of plane geometry.
Algebraic topology, combinatorial topology, differential topology, and
point-set topology are all branches of this relatively young subject, with very few
results dating before 1850.

The reader may have heard of some of the famous problems of topology, for
example: the Four Color Problem, the Seven Bridges of Königsberg Problem,
and the Cranky Neighbor Problem. The last two of these problems have been solved
using topological methods, whereas the Four Color Problem is perhaps the most
famous outstanding problem in topology.

In this chapter, and in the remainder of this book, we shall concentrate on one 
branch of topology, namely, point-set or general topology. By the end of this book, 
we hope the reader will appreciate the central role this branch of topology plays 
in modern analysis. 

The basic geometric concept underlying the subject of point-set topology is
the notion of "closeness" or "nearness." For most purposes the concept of
closeness associated with a metric will be sufficient. Hence in this book and in this
chapter, particularly, we shall study the topology of metric spaces.

This chapter is divided into two parts. The introductory metric space defini- 
tions and concepts are given in Part A, and a deeper discussion of metric space 
structure is given in Part B. 

We recommend that the reader study Part A first and then go on to the rest 
of the book. The material from Part B can be filled in as needed. 


Part A

Introduction to Metric Spaces

2. METRIC SPACES: DEFINITION 


A metric space is a pair of objects: a set, say $X$, and a metric (or distance
function) $d(x,y)$. More precisely, we say that the pair $(X,d)$ is a metric space if
$X$ is a set, called the underlying set, and $d(x,y)$ is a real-valued function, called the
metric, defined for $x, y \in X$ and satisfying the following conditions or axioms:


(M1) (Positive) $d(x,y) \ge 0$ and $d(x,x) = 0$ (for all $x$ and $y$ in $X$).

(M2) (Strictly Positive) If $d(x,y) = 0$, then $x = y$ (for all $x$ and $y$ in $X$).

(M3) (Symmetry) $d(x,y) = d(y,x)$ (for all $x$ and $y$ in $X$).

(M4) (Triangle Inequality) $d(x,y) \le d(x,z) + d(z,y)$ (for all $x$, $y$, and $z$ in $X$).


All four of the above conditions are in harmony with our concept of distance
in the Euclidean plane. Indeed, the distance function


$$d(x,y) = [(x_1 - y_1)^2 + (x_2 - y_2)^2]^{1/2}$$


discussed in Section 1.2 satisfies each of these properties. The reason for calling
(M4) the triangle inequality is illustrated in Figure 3.2.1. The vertices of the
triangle denote points $x$, $y$, and $z$ in $X$, and the sides of the triangle have lengths
$d(x,y)$, $d(x,z)$, and $d(z,y)$. The triangle inequality (M4) is, then, an abstract
formulation of the triangle inequality of Euclidean geometry. It is very important
to include (M4) as one of the defining properties for a metric. We invite the reader
to watch for it as it is used time and time again in this chapter and later in the book.


Figure 3.2.1.




It can happen that a nonempty set $X$ may have more than one metric defined
on it. For example, if $X = R^2$,


$$d_2(x,y) = [(x_1 - y_1)^2 + (x_2 - y_2)^2]^{1/2}$$

or

$$d_1(x,y) = |x_1 - y_1| + |x_2 - y_2|$$


are metrics on the set $X$. Hence, $(X,d_1)$ and $(X,d_2)$ are metric spaces. More
importantly, they are different metric spaces even though they have the same
underlying set $X$. In general, any nonempty set with more than one element can
have an infinite number of metrics defined on it. Indeed, if $d$ is a metric on $X$, then
$d_\alpha(x,y) = \alpha d(x,y)$, $\alpha > 0$, is a metric on $X$. Other, less trivial examples abound.
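
The axioms (M1) through (M4) can be spot-checked numerically for these metrics. The sketch below tests them on a small hypothetical sample of points in the plane; passing such a check is not a proof, but a failure does exhibit a genuine counterexample. The squared Euclidean distance is included as a function that violates (M4).

```python
import itertools
import math

def d1(x, y):
    return abs(x[0] - y[0]) + abs(x[1] - y[1])

def d2(x, y):
    return math.hypot(x[0] - y[0], x[1] - y[1])

def satisfies_axioms(d, points, tol=1e-12):
    """Check (M1)-(M4) on every triple drawn from `points`."""
    for x, y, z in itertools.product(points, repeat=3):
        if d(x, y) < 0 or d(x, x) != 0:          # (M1)
            return False
        if d(x, y) == 0 and x != y:              # (M2)
            return False
        if abs(d(x, y) - d(y, x)) > tol:         # (M3)
            return False
        if d(x, y) > d(x, z) + d(z, y) + tol:    # (M4)
            return False
    return True

points = [(0, 0), (1, 0), (2, 0), (0, 1), (3, 4), (-2, 5)]

scaled = lambda x, y: 2.5 * d2(x, y)             # alpha * d with alpha = 2.5
squared = lambda x, y: d2(x, y) ** 2             # not a metric: (M4) fails
```

On this sample `d1`, `d2`, and `scaled` pass, while `squared` fails at the collinear points (0,0), (1,0), (2,0), where 4 exceeds 1 + 1.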

A metric space (X,d) is, then, a set X with an additional structure defined by 
means of a metric function d. This additional structure is called the topological 
structure. The bulk of this chapter is devoted to a study of this structure. 

When no confusion can arise we will often denote the metric space (X,d) by X. 

If $A$ is a nonempty subset of a metric space $(X,d)$ and $x$ is a point in $(X,d)$,
then we say that the distance between $x$ and the subset $A$, denoted $d(x,A)$, is given
by the real number


$$d(x,A) = \inf\{d(x,y) : y \in A\}.$$

We say that the diameter of the nonempty subset $A$, denoted $\mathrm{diam}(A)$, is given by

$$\mathrm{diam}(A) = \sup\{d(x,y) : x, y \in A\}.$$


Note that $\mathrm{diam}(A)$ may be $+\infty$. A set $A$ is said to be bounded if its diameter is
finite.
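
For a finite set the infimum and supremum in these definitions are attained, so they reduce to a minimum and a maximum. The following sketch assumes the usual metric on the real line and a hypothetical three-point set `A`.

```python
def dist_to_set(d, x, A):
    """d(x, A) = inf{d(x, y): y in A}; a minimum when A is finite."""
    return min(d(x, a) for a in A)

def diam(d, A):
    """diam(A) = sup{d(x, y): x, y in A}; a maximum when A is finite."""
    return max(d(a, b) for a in A for b in A)

d = lambda s, t: abs(s - t)    # usual metric on the real line
A = [1.0, 2.5, 4.0]            # hypothetical finite subset

distance_from_zero = dist_to_set(d, 0.0, A)
diameter = diam(d, A)
```

A point of `A` has distance zero to `A`, and a one-point set has diameter zero, in agreement with the definitions.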


The following lemma is a simple consequence of the triangle inequality. We 
ask the reader to verify it. 


3.2.1 LEMMA. Let $(X,d)$ be a metric space. For any three points $x$, $y$, $z$ in $X$
one has


$$|d(x,y) - d(y,z)| \le d(x,z).$$

EXERCISES

1. Let $d$ be a metric on $X$ and let $d_\alpha(x,y) = \alpha d(x,y)$, where $0 < \alpha < 1$. Show that
$d_\alpha \ne d$ if $X$ has two or more points.


2. Show that $d(x,y) = |x - y|$ is a metric on the real numbers $R$; on the complex
numbers $C$.


3. Let $d(x,y)$ be a metric on $X$. Show that

$$d_1(x,y) = \frac{d(x,y)}{1 + d(x,y)} \quad \text{and} \quad d_2(x,y) = \min(1, d(x,y))$$

are also metrics on $X$. Show that every set in the metric space $(X,d_i)$ ($i = 1, 2$)
is bounded.




4. A real-valued function $p(x,y)$ is said to be a pseudometric on $X$ if it satisfies
conditions (M1), (M3), and (M4).
(a) Show that $p(x,y) = 0$ is a pseudometric on any set $X$.
(b) Show that $p\{(x_1,x_2), (y_1,y_2)\} = |x_1 - y_1|$ is a pseudometric in the plane $R^2$.


5. Show that if $A$ is nonempty, in a metric space $(X,d)$, then $\mathrm{diam}\,A = 0$ if and
only if $A$ consists of a single point. Is this true in a pseudometric space?


3. EXAMPLES OF METRIC SPACES 


The reader may wonder how metric spaces arise in practice. In some cases
they arise quite naturally with the statement of a problem. For example, in surveying
a flat piece of ground our set $X$ is naturally chosen to be the set of all points in
the area under consideration and our metric is naturally chosen to be the usual
distance function. In other cases metric spaces are chosen on the basis of mathematical
convenience. It often happens that the metric space which appears to be
the most natural for a given problem leads to intractable mathematical difficulties.
Consequently, we compromise and choose a metric space which is not as naturally
related to the problem as we might desire but which does lead to a tractable
mathematical setting. For example, the great use of square-root-integral-square
and square-root-sum-square criteria, that is, criteria of the form


$$\left\{ \int_a^b |x(t) - y(t)|^2\,dt \right\}^{1/2}$$


and

$$\left\{ \sum_i |x_i - y_i|^2 \right\}^{1/2}$$


throughout engineering and science is often more a result of a desire for mathematical
convenience than of the universal naturalness of such criteria.

Let us consider some examples of metric spaces. Some of these examples are
intended as illustrations of mathematical points while others are intended to
illustrate the occurrence of metric spaces in applied mathematics.

Since some of these examples may be familiar, there is a possible source of
confusion. In meaningful applications we seldom deal with sets which have a
metric space structure only. Usually another structure is present. Furthermore, the
additional structure may be so familiar that it may require a conscious effort to
ignore it. Thus, it is possible to confuse the metric structure of an example with
the other structure which may be present. We caution the reader to be on his guard
against this source of confusion.


EXAMPLE 1. We have already delved into the structure of the real Euclidean
plane at some length in Section 1.2. Since it was our prototype, we are not surprised
that it has enough structure (in fact, more than enough) to make it a metric space.

We mentioned the possibility that more than one metric could be defined on
some sets. This is not only possible, it is quite common on $R^2$.




; Let x =(x,,x2) and y =(y1,y2) be points in R?. The Euclidean metric on 
R* is 


A(x,y) = Lx, — y)° + (x2 - yo) J". 


However, there are many other metrics for R?. In particular, we claim that in each 
of the following cases the function d(x,y) defined below is a metric on R?: 


(a) d(x,y) = d,(x,y) = |x, — yy] + [x2 — yal, 
(b) d(x,y) = d,.(x,y) = max{ |x, — y,|, x2 — yal}, 
(c) d(x,y) = d,(x,y) = { |x, — yl? + [x2 — yal?})”, 


where 1 < p < © (p =2 corresponds to the Euclidean metric; p = | corresponds 
to (a); and (b) is sometimes called the p = oo metric). 


(d) d(x,y) = {a|x, - y,! + b|x, — yx) |%2 — y2| + elx2 - Oa ae where a > 
0, c> 0, and 4ac — b? > 0. 


The proof that (a), (b), (c), and (d) are indeed metrics is left as an exercise.
Although these examples yield a rich supply of metrics on $R^2$, it should not be
imagined that we have come close to exhausting the supply. (What are some others?)

Each of the above metrics yields a different metric space even though the
underlying set $R^2$ is always the same. We will see subsequently, when we discuss
open sets, that these metric spaces are "equivalent" to one another in an important
and useful sense. Although we are unable to discuss this concept of equivalence
now, let us hasten to add that not all metric spaces with $R^2$ as the underlying set
are equivalent to one another. For example, the function $d(x,y) = 0$, if $x = y$, and
$d(x,y) = 1$, if $x \ne y$, defines a metric on $R^2$ which yields a metric space which is
not equivalent to any of the preceding examples. ∎
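
The metrics of this example are easy to evaluate numerically. The following is a minimal Python sketch, not part of the original text; the function names (`d_p`, `d_inf`, `d_discrete`, `satisfies_triangle`) and the sample points are illustrative assumptions. It evaluates several of the metrics on one pair of points and spot-checks the triangle inequality on a small grid.

```python
import itertools

def d_p(x, y, p):
    """Metric (c) on R^2 for 1 <= p < infinity; p = 1 gives (a), p = 2 is Euclidean."""
    return (abs(x[0] - y[0]) ** p + abs(x[1] - y[1]) ** p) ** (1.0 / p)

def d_inf(x, y):
    """Metric (b), the 'p = infinity' metric."""
    return max(abs(x[0] - y[0]), abs(x[1] - y[1]))

def d_discrete(x, y):
    """The 0/1 metric mentioned at the end of the example."""
    return 0.0 if x == y else 1.0

def satisfies_triangle(d, points, tol=1e-12):
    """Spot-check d(x, y) <= d(x, z) + d(z, y) on a finite set of points."""
    return all(d(x, y) <= d(x, z) + d(z, y) + tol
               for x, y, z in itertools.product(points, repeat=3))

# Illustrative sample points (an assumption, not taken from the text).
grid = [(float(i), float(j)) for i in range(-2, 3) for j in range(-2, 3)]
x, y = (0.0, 0.0), (3.0, 4.0)
distances = {
    "d_1": d_p(x, y, 1),
    "d_2": d_p(x, y, 2),
    "d_inf": d_inf(x, y),
    "discrete": d_discrete(x, y),
}
```

A check on finitely many points is of course not a proof; it only illustrates that the same pair of points can be at distance 7, 5, 4, or 1, depending on the metric chosen.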


EXAMPLE 2. Let $X$ be the set made up of all ordered n-tuples of real numbers,
$x = (x_1,\ldots,x_n)$. We shall write this as $X = R^n$. The following are metrics on $R^n$.

(a) $d_p(x,y) = [|x_1 - y_1|^p + \cdots + |x_n - y_n|^p]^{1/p}$, where $1 \le p < \infty$.
(b) $d_\infty(x,y) = \max\{|x_1 - y_1|, \ldots, |x_n - y_n|\}$.

It is easy to show that (b) is a metric, and it is easy to show that (a) satisfies
conditions (M1), (M2), and (M3). That (a) satisfies the triangle inequality,
condition (M4), follows from the Minkowski Inequality (see Appendix A). The
function $d_p$ is sometimes referred to as the $l_p$-metric on $R^n$. ∎


EXAMPLE 3. Let $X$ be the set made up of all ordered n-tuples of complex
numbers, $x = (x_1,\ldots,x_n)$. We shall write this as $X = C^n$. What would be the
analogs to (a) and (b) of Example 2 in this case? ∎




EXAMPLE 4. Let $l_p = l_p(0,\infty)$ be the set made up of all infinite sequences
of real (or complex) numbers, $x = (x_1,x_2,\ldots)$, such that the series $\sum_{i=1}^{\infty} |x_i|^p$
converges, where $1 \le p < \infty$. We claim that the function

$$d_p(x,y) = \left[\sum_{i=1}^{\infty} |x_i - y_i|^p\right]^{1/p}$$

defines a metric on $l_p$. It follows from the Minkowski Inequality for infinite sums
(Appendix A) that the series defining $d_p(x,y)$ always converges for $x$ and $y$ in $l_p$.
Furthermore, we see that property (M4) for a metric is satisfied. The other three
properties (M1), (M2), and (M3) are easily verified. Hence $(l_p,d_p)$ is a metric
space for each $p$, $1 \le p < \infty$. Obviously, different $p$'s yield different metric spaces.¹
(Do any two of these metric spaces have the same underlying set?)

Such sequence spaces occur throughout engineering and science. For example,
the mathematical model for the set of all inputs to a sampled-data control system
is often taken to be $(l_p,d_p)$. Another example is to consider a typical point in
$(l_p,d_p)$ as the sequence of coefficients associated with the modes of vibration of a
mechanical system which has an infinite number of modes (for example, a plucked
string).

Since these spaces occur so often we shall refer to them as $l_p$ "with the usual
metric" or sometimes simply as $l_p$. ∎


EXAMPLE 5. Let $l_\infty$ be the set of all bounded sequences of real (or complex)
numbers. That is, a sequence $\{x_n\}$ is in $l_\infty$ if and only if there exists a real number
$M$ such that $|x_n| \le M$ for all $n$. For example, the sequence $\{\alpha,\alpha,\alpha,\ldots\}$, where $\alpha$
is a real number, is in $l_\infty$; whereas the sequence $\{1,2,3,4,\ldots\}$ is not. Let

$$d_\infty(x,y) = \sup_n |x_n - y_n|.$$

It is easily shown that $(l_\infty,d_\infty)$ is a metric space. This space is a generalization of
(b) in Example 2. Note that here "sup" is used instead of "max."
This space is also referred to as $l_\infty$ with the usual metric.

As in Example 4 we shall use $l_\infty(0,\infty)$ to denote the $l_\infty$-space of ordinary
sequences $x = (x_1,x_2,\ldots)$ and $l_\infty(-\infty,\infty)$ to denote the $l_\infty$-space of doubly
infinite sequences $x = (\ldots,x_{-1},x_0,x_1,\ldots)$. ∎


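EXAMPLE 6. Let $B$ be the set of all infinite sequences of positive integers
$n = \{n_1,n_2,\ldots\}$, and let $d$ be defined by

$$d(n,m) = \begin{cases} 0, & \text{if } n_i = m_i \text{ for } i = 1,2,\ldots \\ 1/k, & \text{where } k \text{ is the first index } i \text{ for which } n_i \ne m_i. \end{cases}$$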


¹ We sometimes are interested in the space of doubly infinite sequences $x = (\ldots,x_{-1},x_0,x_1,\ldots)$
with $\sum_{n=-\infty}^{\infty} |x_n|^p < \infty$. We shall denote this space by $l_p(-\infty,\infty)$. The metric is then defined
by $d_p(x,y) = \left(\sum_{n=-\infty}^{\infty} |x_n - y_n|^p\right)^{1/p}$.




$(B,d)$ is a metric space called the Baire null-space. It is interesting to note that this
metric satisfies a stronger triangle inequality, namely,

$$d(n,m) \le \max\{d(n,p), d(p,m)\}.$$

Just to see that rather "weird looking" metrics can arise in practice, consider
the following use of the Baire null-space as a mathematical model. Suppose that a
function $s(t)$ is a signal to be sent through a communication system and that $s(t)$
is (1) sampled every second, (2) quantized, and (3) the quantization levels are
encoded as positive integers. This processing of $s(t)$ is illustrated in Figure 3.3.1.


[Figure 3.3.1. Quantized levels of the sampled signal plotted against time in seconds.]


The sequence of positive integers representing the signal $s(t)$ in Figure 3.3.1
would be

$$n_s = \{0,1,3,5,6,7,7,8,8,7,\ldots\}.$$

Further, suppose that in being processed by the remainder of the communication
system, disturbances are introduced which tend to accumulate and eventually
cause incorrect quantization levels to be received. If $n_r = \{n_{r1},n_{r2},\ldots\}$ denotes
the received sequence associated with $n_s$, it is of interest to know how long the
system runs before an error in quantization level occurs. Thus we can use the
Baire null metric, $d(n_s,n_r)$, to characterize the distance or difference between the
sent and the received sequence. The smaller $d(n_s,n_r)$ is, the longer the communication
system runs before an error occurs. ∎
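
As an illustration only, the Baire null metric can be sketched in Python on finite prefixes of the sequences. The function name `baire_distance` and the two "received" sequences below are hypothetical; only the sent sequence is taken from the text, and truncating infinite sequences to ten terms is an assumption of the sketch.

```python
def baire_distance(n, m):
    """Baire null-space distance, compared on finite prefixes of two sequences.

    Returns 1/k, where k (counted from 1) is the first index at which the
    sequences differ, and 0 if the prefixes agree everywhere.
    """
    for k, (a, b) in enumerate(zip(n, m), start=1):
        if a != b:
            return 1.0 / k
    return 0.0

sent = [0, 1, 3, 5, 6, 7, 7, 8, 8, 7]       # the sequence from the text
received = [0, 1, 3, 5, 6, 7, 6, 8, 8, 7]   # hypothetical: first error at sample 7
other = [0, 1, 3, 4, 6, 7, 7, 8, 8, 7]      # hypothetical: first error at sample 4

d_sr = baire_distance(sent, received)
d_so = baire_distance(sent, other)
d_ro = baire_distance(received, other)
```

Here a later first error gives a smaller distance, and the three distances are consistent with the stronger triangle inequality stated above.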


EXAMPLE 7. Let $X_1 = Q$, the set of all rational numbers, and let $d_1(x,y) =
|x - y|$. Similarly, let $X_2 = R$, the set of all real numbers, and $d_2(x,y) = |x - y|$.
The metric spaces $(X_1,d_1)$ and $(X_2,d_2)$ are different, and they differ from one
another in an important way which will become clear when we discuss complete
metric spaces (Sections 13 and 14). ∎




Now let us consider some metric spaces whose underlying sets are sets of 
functions. 


EXAMPLE 8. Let $X = C[0,T]$ be the set of all real-valued (or complex-valued)
continuous functions defined on the interval $[0,T]$, where $T > 0$. We define a
metric on $C[0,T]$, called the sup-metric, by

$$d_\infty(x,y) = d(x,y) = \sup\{|x(t) - y(t)| : 0 \le t \le T\}.$$

Let us show that the sup-metric satisfies the triangle inequality. Let $x$, $y$, and $z$
be arbitrary elements of $C[0,T]$; then

$$\begin{aligned}
d(x,y) &= \sup_t |x(t) - z(t) + z(t) - y(t)| \\
&\le \sup_t \{|x(t) - z(t)| + |z(t) - y(t)|\} \\
&\le \sup_t |x(t) - z(t)| + \sup_t |z(t) - y(t)| \\
&= d(x,z) + d(z,y).
\end{aligned}$$

The remaining conditions on a metric are easily shown to be satisfied; hence,
$d(x,y)$ is a metric and $(C[0,T], d)$ is a metric space. This metric space is often
simply denoted by $C[0,T]$. ∎
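
For a numerical feel, the sup-metric can be estimated by sampling. The sketch below is an illustration under stated assumptions: the name `sup_metric`, the uniform grid, and the choice of functions ($\sin$, $\cos$, and the zero function on $[0,\pi]$) are ours, not the text's.

```python
import math

def sup_metric(x, y, T, samples=1000):
    """Estimate sup{|x(t) - y(t)| : 0 <= t <= T} by sampling a uniform grid.

    The grid maximum never exceeds the true supremum, so this is a lower
    estimate that improves as `samples` grows.
    """
    return max(abs(x(T * k / samples) - y(T * k / samples))
               for k in range(samples + 1))

zero = lambda t: 0.0
d_xy = sup_metric(math.sin, math.cos, math.pi)   # close to sqrt(2), attained at t = 3*pi/4
d_xz = sup_metric(math.sin, zero, math.pi)       # close to 1
d_zy = sup_metric(zero, math.cos, math.pi)       # close to 1
```

The estimated values satisfy $d(x,y) \le d(x,z) + d(z,y)$, in agreement with the argument above.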


EXAMPLE 9. Let $X = C[0,T]$ and now define

$$d_p(x,y) = \left[\int_0^T |x(t) - y(t)|^p \, dt\right]^{1/p},$$

where $1 \le p < \infty$. By using the Minkowski Inequality for integrals (Appendix A)
it is easily seen that $d_p$ defines a metric on $C[0,T]$. We note that when $p = 2$, the
metric is the square-root-integral-square criterion referred to earlier. ∎


EXAMPLE 10. Let $X = L_p(I)$, $1 \le p < \infty$, be the set of all real- (or complex-)
valued functions $x(\cdot)$ defined on the interval $I$ such that

$$\int_I |x(t)|^p \, dt < \infty,$$

where the integral is the Lebesgue integral (see Appendix D). Define the metric
on $L_p(I) = L_p$ by

$$d_p(x,y) = \left[\int_I |x(t) - y(t)|^p \, dt\right]^{1/p}.$$

This function $d_p$ is referred to as the "usual metric" on $L_p(I)$.
There is a corresponding space $L_\infty(I)$. This consists of all real- (or complex-)
valued functions $x(\cdot)$ defined on the interval $I$ such that

$$\|x\|_\infty = \operatorname*{ess\,sup}_{t \in I} |x(t)| < \infty.$$




This means (see Section D.11) that a function $x(\cdot)$ belongs to $L_\infty(I)$ if and only if
there is a real number $N$ with the property that

$$|x(t)| \le N \quad \text{(almost everywhere)}.$$

In this case $\|x\|_\infty$ is then given by

$$\|x\|_\infty = \inf\{B : |x(t)| \le B \text{ (almost everywhere)}\}.$$

The "usual metric" on $L_\infty(I)$ is then given by

$$d_\infty(x,y) = \|x - y\|_\infty.$$

This example raises a technical problem which occurs often with function
spaces defined in terms of integrals. As it stands, $d_p$ does not define a metric on
$L_p$, for we can easily find an $x$ and $y$ in $L_p$ such that $x \ne y$ but $d_p(x,y) = 0$. For
example, let $x(t) = y(t)$ for all $t$ except the point $t = 1$ and let $x(1) = y(1) + 1$.
Clearly $x$ and $y$ are different functions and $d_p(x,y) = 0$. In other words, $d_p$ is a
pseudometric (compare with Exercise 4, Section 2) and not a metric. There are
several ways of "changing" $d_p$ into a metric. We adopt the following point of view:

We define a new equality between functions in $L_p$. We say that $x = y$ if and
only if $d_p(x,y) = 0$. We see (Appendix D) that $x = y$ (in the new equality) if and
only if $x(t) = y(t)$ everywhere except perhaps on a set of measure zero.

This new equality does turn $d_p$ into a metric and $(L_p,d_p)$ into a metric space.
However there are several hidden defects that come with this point of view. First,
we can no longer distinguish between functions that differ on finite sets, countable
sets, or more generally, sets of measure zero. Secondly, a technical problem arises
when one asks whether a given property, that is valid under the old equality, is
now valid under the new equality. We shall deal with these difficulties as they arise.
The reader will see that they do not present serious obstacles. (See Example 8,
Section 2.5.) ∎


EXAMPLE 11. Let $X = BC(I)$, where $BC(I)$ is the set of all real-valued (or
complex-valued) continuous, bounded functions $x(t)$ defined on the finite or
infinite interval $I$. Recall that a function $x$ is bounded on $I$ if there exists a real
number $M$ (the number $M$ depends on the function $x$) such that $|x(t)| \le M$ on $I$.
Let the metric on $BC(I)$ be the sup-metric, which is given by

$$d(x,y) = \sup\{|x(t) - y(t)| : t \in I\}. \;∎$$


EXAMPLE 12. Let $X$ be the set of all continuous functions $x(t,s)$ defined on
some closed, bounded set $D$ in the $(t,s)$-plane. Let the metric on $X$ be the
sup-metric:

$$d(x,y) = \sup\{|x(t,s) - y(t,s)| : (t,s) \in D\}. \;∎$$


EXAMPLE 13. Given a real number $\sigma$ let $C_\sigma$ denote the vertical line in the
complex plane defined by

$$C_\sigma = \{s : \operatorname{Re}(s) = \sigma\}.$$




[Figure 3.3.2. The vertical line $C_\sigma$ in the complex plane, with the real ($\sigma$) axis and the imaginary ($i\omega$) axis.]


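Let $L_{2,\sigma}(-i\infty,i\infty)$ be the set of all complex-valued functions $x(s)$ defined on
$C_\sigma$ such that

$$\frac{1}{2\pi i} \int_{\sigma - i\infty}^{\sigma + i\infty} |x(s)|^2 \, ds < \infty.$$

Define the metric on $L_{2,\sigma}(-i\infty,i\infty)$ by

$$d_2(x,y) = \left\{\frac{1}{2\pi i} \int_{\sigma - i\infty}^{\sigma + i\infty} |x(s) - y(s)|^2 \, ds\right\}^{1/2}.$$

The metric spaces $(L_{2,\sigma}(-i\infty,i\infty), d_2)$ play an important role in Fourier and
Laplace transform analysis.

The space $L_{2,\sigma}(-i\infty,i\infty)$ that arises when $\sigma = 0$ (that is, when the path of
integration is the imaginary axis) will occur often in examples and exercises in this
book. We shall denote this space simply by $L_2(-i\infty,i\infty)$. ∎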


EXAMPLE 14. Let $C_\sigma$ be defined as in Example 13, and let $Y_\sigma$ be the set of
all continuous and bounded complex-valued functions $y(s)$ defined on $C_\sigma$. Define
the metric on $Y_\sigma$ by

$$d_\infty(x,y) = \sup\{|x(s) - y(s)| : s \in C_\sigma\}.$$

The metric spaces $(Y_\sigma,d_\infty)$ also play an important role in Fourier and Laplace
transform analysis. ∎


EXAMPLE 15. Let $X = C(-\infty,\infty)$ be the set of all continuous, real- (or
complex-) valued functions defined on the real line $R = (-\infty,\infty)$. Define

$$\begin{aligned}
\rho_n(x,y) &= \sup\{|x(t) - y(t)| : |t| \le n\}, \\
\delta_n(x,y) &= \min\{1, \rho_n(x,y)\}, \\
\sigma(x,y) &= \sum_{n=1}^{\infty} \frac{1}{2^n} \, \delta_n(x,y).
\end{aligned}$$

Then $(X,\sigma)$ is a metric space. If we let

$$d(x,y) = \sup_{0 < T} \left\{ \min\left[ \frac{1}{T}, \; \sup_{|t| \le T} |x(t) - y(t)| \right] \right\},$$

then $(X,d)$ is another metric space. ∎


EXAMPLE 16. This example may appear trivial, yet it does have some mathematical
importance and there are situations where it is an appropriate mathematical
model. Let $X$ be any nonempty set, and define $d(x,y)$ by

$$d(x,y) = \begin{cases} 0, & \text{if } x = y \\ 1, & \text{if } x \ne y. \end{cases}$$

It is a simple matter to show that $(X,d)$ is a metric space. An example of a physical
situation where $(X,d)$ may serve as a mathematical model arises when $X$ is a set
of targets. If we miss the intended target, hitting some other one, we say that the
miss distance or cost is 1, that is, all misses are equally weighted. If we hit the
desired target, the miss distance or cost is zero. ∎


EXAMPLE 17. Let X(n) be the set of all ordered n-tuples of “zeros” and 
“ones.” For example, 


X(3) = {000,001,010,011,100,101,110, 111}. 
For x and y in X(n), let 
d(x,y) = number of places where x and y have different entries. 
For example, in X(3) 
d(110,110) = 0 
d(010,110) = 1 
and 


d(101,010) = 3. 


We leave it to the reader to show that $(X(n),d)$ is a metric space. This metric space
occurs in switching and automata theory. ∎
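
The metric of this example is commonly called the Hamming distance (a name not used in the text). The following Python sketch, with the hypothetical helper name `hamming`, reproduces the three values computed above and checks the triangle inequality exhaustively on $X(3)$; it is an illustration, not a substitute for the proof requested of the reader.

```python
from itertools import product

def hamming(x, y):
    """Number of places where the equal-length strings x and y differ."""
    if len(x) != len(y):
        raise ValueError("x and y must have the same length")
    return sum(a != b for a, b in zip(x, y))

# X(3): all ordered 3-tuples of zeros and ones, written as strings.
X3 = ["".join(bits) for bits in product("01", repeat=3)]

# Exhaustive check of the triangle inequality on X(3).
triangle_ok = all(hamming(x, y) <= hamming(x, z) + hamming(z, y)
                  for x in X3 for y in X3 for z in X3)
```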


EXERCISES 


1. For many of the examples of metric spaces given in this section, no proof was 
given that the claimed metric space was indeed a metric space. Supply the 
missing proofs. 


2. Describe a mathematical model using each of the metric spaces given in the 
examples. 

3. For each metric space given in Example 1, sketch the set of all points $x$ such
that $d(x,y_0) < 1$, where $y_0 = (0,0)$.


4. A warehouse stores apples, rope, gasoline, sand, and plate glass. What are
some of the ways that one might use a metric space as a mathematical model
to characterize differences in warehouse content?

5. Let $X$ be the set containing the cities Los Angeles, New York, Chicago, and
San Francisco. How might an air traveler place a metric on this set? A rail
traveler? An automobile traveler? A truck driver? A yacht owner?

6. Let $X = R^2$ and let

$$d(x,y) = \{|x_1 - y_1|^{1/2} + |x_2 - y_2|^{1/2}\}^2.$$

Is $\{X,d\}$ a metric space?

7. In $R^2$ let $A = \{x = (x_1,x_2) : (|x_1|^2 + |x_2|^2)^{1/2} < 1\}$, and

$$d(x,y) = \{|x_1 - y_1|^2 + |x_2 - y_2|^2\}^{1/2}.$$

Compute $d(x,A)$. Show that $d(x,A) = 0$ if and only if

$$|x_1|^2 + |x_2|^2 \le 1.$$


8. In Example 5, the metric $d_\infty$ was defined with a "sup" instead of a "max."
In order to see the necessity of this let $x = (x_1,x_2,\ldots)$, $y = (y_1,y_2,\ldots)$ be
given by

and Vz= 


Show that $d(x,y) > |x_n - y_n|$ for all $n \ge 1$.
On the other hand, in Example 8 "sup" could be replaced by "max." We
will see why in Section 17. What about Example 11?


9. Sketch the set of points $x = (x_1,x_2)$ in $R^2$ for which

$$d_p(0,x) = 1,$$

where $0 = (0,0)$ and $d_p$ is defined in Example 1. With $x$ and $y$ fixed show that
$d_p(x,y)$ is decreasing in $p$. Hence

$$d_p(x,y) \ge d_q(x,y)$$

whenever $p \le q$.

10. Use the techniques of Exercise 9 to show that whenever $p \le q$ one has $d_p(x,y) \ge
d_q(x,y)$ on $R^n$.


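11. Let $C$ be the unit circle in the complex plane, that is, $C = \{z : |z| = 1\}$.
Let $X$ denote all complex-valued functions $f(z)$ defined on $C$ for which
$\int_C |f(z)|^2 \, |dz| < \infty$. Show that

$$d(f,g) = \left(\int_C |f(z) - g(z)|^2 \, |dz|\right)^{1/2} = \left(\int_0^{2\pi} |f(e^{i\theta}) - g(e^{i\theta})|^2 \, d\theta\right)^{1/2}$$

is a metric on $X$.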




12. Let $X$ denote the class of all complex-valued functions $f(z)$ that are analytic
for $|z| < 1$. Let $f(z) = \sum_{n=0}^{\infty} a_n z^n$, $g(z) = \sum_{n=0}^{\infty} b_n z^n$ be the power series expansions
for $f$, $g$ in $X$. Show that the following functions are metrics, or pseudometrics.
(a) $d(f,g) = \sup\{|f(z) - g(z)| : |z| \le \rho\}$, where $0 < \rho < 1$.
(b) $d(f,g) = |a_0 - b_0|$.
(c) $d(f,g) = |a_0 - b_0| + |a_1 - b_1|$.
(d) $d(f,g) = \sum_{n=0}^{\infty} |a_n - b_n| \rho^n$, where $0 < \rho < 1$.


(e) $d(f,g) = \sup\left\{\left|\frac{d}{dz} f(z) - \frac{d}{dz} g(z)\right| : |z| \le \rho\right\}$, where $0 < \rho < 1$.

13. Let $S$ be a nonempty set and let $X = B(S)$ denote the collection of all bounded,
real-valued functions defined on $S$. Show that

$$d(f,g) = \sup\{|f(s) - g(s)| : s \in S\}$$

is a metric on $X$.


14. Let $X(n)$ denote the collection of all differential operators of the form

$$P(D) = p_n D^n + p_{n-1} D^{n-1} + \cdots + p_1 D + p_0,$$

where the coefficients $p_i$ are real constants. Show that

$$d(P(D), Q(D)) = \sum_{i=0}^{n} |p_i - q_i|$$

is a metric on $X(n)$, where $P(D)$ is given above and

$$Q(D) = \sum_{i=0}^{n} q_i D^i.$$

Compare this metric space with the space $(R^{n+1},d_1)$ discussed in Example 2.


15. Let $X$ denote the collection of all bounded closed intervals $[a,b]$ from the
real line. Let $[a,b]$ and $[c,d]$ be two sets. Show that the symmetric difference
$[a,b] \,\Delta\, [c,d]$ is the union of (at most) two bounded intervals. Define
$\rho([a,b],[c,d])$ to be the sum of the lengths of these intervals. Show that $\rho$ is a
pseudometric on $X$. Discuss the relation $\rho([a,b],[c,d]) = 0$.


4. SUBSPACES AND PRODUCT SPACES 


In the last section we examined several examples of metric spaces. These 
examples by no means exhaust the supply of known metric spaces. If we are given 
some metric spaces, there are several ways of constructing new metric spaces. In 
this section we shall discuss two such methods, namely, subspaces and product 
spaces. 


Subspaces 


Let $(X,d)$ be a metric space and let $A$ be a subset of $X$. We can use the metric
$d$ to define a metric on $A$ in a very natural way. If $x$ and $y$ are in $A$ we simply let
the distance between $x$ and $y$ be given by $d(x,y)$. Obviously, $(A,d)$ is a metric space.
Since $A \subset X$, we say that $(A,d)$ is a subspace of $(X,d)$, or the topological structure
on $A$ is the structure inherited from $(X,d)$. Needless to say, there are usually many
metrics that can be placed on $A$; however, when we say that $(A,d)$ is a subspace
of $(X,d)$ it is understood that the inherited metric is implied.

If $(A,d)$ is a subspace of $(X,d)$ and $A \ne X$, we sometimes emphasize the fact
that $A$ is not all of $X$ by saying that $(A,d)$ is a proper subspace of $(X,d)$.

Let us consider a few examples of subspaces and their applications. 


EXAMPLE 1. Suppose we are interested in a vibrating string as shown in
Figure 3.4.1. Let $C[0,L]$ be the set of all real-valued, continuous functions $z$
defined on the closed interval $0 \le x \le L$, and let

$$d(z_1,z_2) = \sup\{|z_1(t_1,x) - z_2(t_1,x)| : 0 \le x \le L\}.$$

[Figure 3.4.1. A vibrating string; $z(t_1,x)$ is the deflection of the string at time $t_1$ as a function of $x$.]

Here we view the deflection of the vibrating string at time $t_1$ as a point $z(t_1,\cdot)$ in the
metric space $\{C[0,L],d\}$. The motion of the vibrating string can be viewed
as a curve $z(t,\cdot)$ in $\{C[0,L],d\}$ parameterized by $t$. However, not every point in
$\{C[0,L],d\}$ can lie on this curve, for, as can be seen in Figure 3.4.1, the boundary
conditions must be satisfied. In particular, the deflections at $x = 0$ and $x = L$ must
always be zero. Let $A$ be the subset of $C[0,L]$ made up of all real-valued, continuous
functions $z$ such that $z = 0$ at $x = 0$ and $x = L$. Obviously, $(A,d)$ is a subspace of
the metric space $\{C[0,L],d\}$, and all possible deflections of the vibrating string are
contained in this subspace. ∎


EXAMPLE 2. Consider the following system of differential equations

$$\begin{aligned}
\frac{dx_1}{dt} &= a_{11} x_1 + a_{12} x_2, \\
\frac{dx_2}{dt} &= a_{21} x_1 + a_{22} x_2.
\end{aligned} \tag{3.4.1}$$

This system of equations can be the mathematical model for many systems. For
example, it can be used as a mathematical model for the electrical network shown
in Figure 3.4.2. In this case, $a_{11} = -(R_1 + R_2)/C_1$, $a_{12} = R_2/C_1$, $a_{21} = R_2/C_2$,
and $a_{22} = -(R_2 + R_3)/C_2$. Moreover, $x_1$ and $x_2$ are the voltages across the
capacitors $C_1$ and $C_2$, respectively.

[Figure 3.4.2. Electrical network with resistors $R_1$, $R_2$, $R_3$ and capacitors $C_1$, $C_2$.]

Let us assume that $a_{11} = a_{22} = -2$ and $a_{12} = a_{21} = 1$. Then the general
solution of (3.4.1) is

$$\begin{aligned}
x_1(t) &= \tfrac{1}{2}(e^{-t} + e^{-3t}) x_1(0) + \tfrac{1}{2}(e^{-t} - e^{-3t}) x_2(0), \\
x_2(t) &= \tfrac{1}{2}(e^{-t} - e^{-3t}) x_1(0) + \tfrac{1}{2}(e^{-t} + e^{-3t}) x_2(0),
\end{aligned} \tag{3.4.2}$$

where $x_1(0)$ and $x_2(0)$ are the initial conditions.

Let $X$ be the set of all ordered pairs $(x_1,x_2)$ of bounded, continuous, real-valued
functions defined on the interval $0 \le t < \infty$. Let

$$d(x,y) = \sup_t |x_1(t) - y_1(t)| + \sup_t |x_2(t) - y_2(t)|.$$

We leave it to the reader to show that $(X,d)$ is a metric space.

For a given pair of initial conditions, $x_1(0)$ and $x_2(0)$, the ordered pair of
functions given in (3.4.2) is a point in the metric space $(X,d)$. Moreover, if $A$ is the
set of all solutions of (3.4.1), that is, all the ordered pairs given by (3.4.2), $A$ is a
set in $(X,d)$.

Obviously, $(A,d)$ is a subspace of $(X,d)$. Is $A$ a proper subset of $X$ or does
$A = X$? ∎
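
As a sanity check on the closed-form solution of this example (with $a_{11} = a_{22} = -2$ and $a_{12} = a_{21} = 1$), one can compare a finite-difference derivative of the solution with the right-hand side of the differential equations. The sketch below is an illustration only; the names `solution` and `residual`, the step size, and the initial conditions $(1, -2)$ are assumptions of the sketch.

```python
import math

def solution(t, x10, x20):
    """Closed-form solution for a11 = a22 = -2, a12 = a21 = 1."""
    e1, e3 = math.exp(-t), math.exp(-3.0 * t)
    x1 = 0.5 * (e1 + e3) * x10 + 0.5 * (e1 - e3) * x20
    x2 = 0.5 * (e1 - e3) * x10 + 0.5 * (e1 + e3) * x20
    return x1, x2

def residual(t, x10, x20, h=1e-5):
    """Largest gap between a central-difference derivative and the right-hand side."""
    x1p, x2p = solution(t + h, x10, x20)
    x1m, x2m = solution(t - h, x10, x20)
    x1, x2 = solution(t, x10, x20)
    dx1 = (x1p - x1m) / (2.0 * h)
    dx2 = (x2p - x2m) / (2.0 * h)
    return max(abs(dx1 - (-2.0 * x1 + x2)), abs(dx2 - (x1 - 2.0 * x2)))

# Illustrative initial conditions x1(0) = 1, x2(0) = -2.
worst = max(residual(0.1 * k, 1.0, -2.0) for k in range(1, 50))
start = solution(0.0, 1.0, -2.0)
```

The residual stays at the level of the finite-difference error, and the solution returns the initial conditions at $t = 0$.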


Product Spaces 


The concept of product space is a little more complicated than that of a
subspace. Let $(X,d_x)$ and $(Y,d_y)$ be two metric spaces. Recall (Section 2.3) that
the product set $Z = X \times Y$ is defined as the collection of all ordered pairs $(x,y)$
with $x \in X$ and $y \in Y$. Using the metrics $d_x$ and $d_y$ we can define a metric on $Z$ in
several ways. Letting $u = (x_1,y_1)$ and $v = (x_2,y_2)$ be elements of $Z$, the following
functions are metrics on $Z$:

(a) $d_1(u,v) = d_x(x_1,x_2) + d_y(y_1,y_2)$.
(b) $d_2(u,v) = [d_x^2(x_1,x_2) + d_y^2(y_1,y_2)]^{1/2}$.
(c) $d_p(u,v) = [d_x^p(x_1,x_2) + d_y^p(y_1,y_2)]^{1/p}$, $1 \le p < \infty$.
(d) $d_\infty(u,v) = \max\{d_x(x_1,x_2), d_y(y_1,y_2)\}$.
(e) $d(u,v) = [a\,d_x^2(x_1,x_2) + b\,d_x(x_1,x_2)\,d_y(y_1,y_2) + c\,d_y^2(y_1,y_2)]^{1/2}$, where
$a > 0$, $c > 0$, $4ac - b^2 > 0$.

(Compare this with Example 1 of Section 3.)

The set $Z = X \times Y$ with any of the above metrics is referred to as the product
space of $(X,d_x)$ and $(Y,d_y)$.

The reader may wonder if the above metrics are the only metrics that can be
placed on $X \times Y$ to yield the product space of $(X,d_x)$ and $(Y,d_y)$. One may also
wonder why we say the product space instead of a product space. After all, each
different metric on $Z$ yields a different metric space.

The answer to the first question is no. Except for the degenerate situation
where $X$ and $Y$ each contain exactly one point, there is an infinite number of
metrics which can be placed on $Z$ to yield the product space. The metrics given
above are merely examples from this infinite set. On the other hand, they are the
ones most often employed.

The answer to the second question is that the metric spaces obtained are
indeed different, yet in a sense they all turn out to be the same or equivalent.
(We shall see this in Section 11.) An understanding of exactly in what sense they
are equivalent must await the introduction of a few more topological concepts.
Suffice it to say here that $Z = X \times Y$ with any of the above five metrics is the
product space.

We will denote product spaces by $(X,d_x) \times (Y,d_y)$ or, where no confusion
can arise, simply by $X \times Y$.
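
The product metrics $d_1$, $d_2$, $d_p$, and $d_\infty$ can be built mechanically from the two component metrics. The following Python sketch is illustrative only; the helper name `product_metric` and the two component spaces (the real line with $|s - t|$, and bit strings with the number of differing places) are assumptions chosen for the example.

```python
def product_metric(d_x, d_y, p=1.0):
    """Return the metric d_p on X x Y built from d_x and d_y.

    Passing p = float('inf') gives the max metric d_inf.
    """
    def d(u, v):
        a = d_x(u[0], v[0])
        b = d_y(u[1], v[1])
        if p == float("inf"):
            return max(a, b)
        return (a ** p + b ** p) ** (1.0 / p)
    return d

# Component metrics (illustrative assumptions).
absolute = lambda s, t: abs(s - t)
differing_places = lambda s, t: float(sum(a != b for a, b in zip(s, t)))

u, v = (1.0, "110"), (4.0, "000")
d1 = product_metric(absolute, differing_places, 1)(u, v)               # 3 + 2
d2 = product_metric(absolute, differing_places, 2)(u, v)               # (9 + 4) ** 0.5
dinf = product_metric(absolute, differing_places, float("inf"))(u, v)  # max(3, 2)
```

The three values differ, although (as the text indicates) the resulting metric spaces turn out to be equivalent in a sense made precise later.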

Let us now consider a few examples. 

EXAMPLE 3. Returning to Example 1 of this section, suppose that in addition
to characterizing the deflection of the vibrating string we want to characterize the
rate of change of the deflection. Thus, if $z = z(t,x)$ denotes the deflection of the
string as a function of time and location,

$$v = v(t,x) = \frac{\partial}{\partial t} z(t,x)$$

will be the velocity of the string as a function of time and location. We will assume
that $v(t,x)$ exists and that for each $t$ it is a continuous function of $x$. We can then
view the rate of change of deflection at time $t$ as a point in the metric space $\{C[0,L],d\}$
discussed in Example 1. The evolution of this velocity can be viewed as a curve
$v(t,\cdot)$ in $\{C[0,L],d\}$ parameterized by $t$. We can view the simultaneous deflection
$z$ and rate of change of deflection $v$ as an ordered pair $\{z,v\}$, and we can view this
ordered pair as a point $s$ in the product space $(C[0,L],d) \times (C[0,L],d)$. The evolution
of deflection and rate of change of deflection can be viewed as a curve $s(t,\cdot)$
in this product space parameterized by $t$, where

$$s(t,x) = (z(t,x), v(t,x)). \;∎$$




EXAMPLE 4. Suppose that we are interested in comparing the economic 
performance of the United States economy from year to year. Further, suppose 
we do this by following certain economic indicators. For example, let the ordered 
N-tuple x = {x(n)} = {x(1),x(2),...,x(N)} be the daily Dow Jones industrial 
average, where N is the number of days during the year for which averages are 
given. (We assume that N does not change from year to year.) Letting X be the 
set of all possible N-tuples x, we assume that a good measure (for some purpose 
not stated here and not guaranteed) of differences in yearly performance is given by 


$$d_X(x,y) = \frac{1}{N} \sum_{n=1}^{N} |x(n) - y(n)|.$$

We also might be interested in the monthly cost of living index. Let the 12-tuple
$z = \{z(1),z(2),\ldots,z(12)\}$ be a year's record of this index. Let $Z$ denote the set of
all possible $z$'s and assume that

$$d_Z(z,w) = \max_k |z(k) - w(k)|$$

is a "good" measure of differences in yearly performance of the cost of living
index.

We can view the record $x$ of the Dow Jones average for a given year as a
point in the metric space $(X,d_X)$ and the record $z$ of the cost of living index as a
point in the metric space $(Z,d_Z)$. Finally, the behavior of the economy for a given
year, say 1950, would be the ordered pair $p_0 = (x_0,z_0)$, where $x_0$ and $z_0$ are values
for 1950. If we form the product $(P,d) = (X,d_X) \times (Z,d_Z)$, $p_0$ is obviously a point
in it. Then, if $p_1$ is the behavior of the economy for 1961, the "distance" between
$p_1$ and $p_0$ is $d(p_1,p_0)$. Of course, we can use any of the metrics (a), (b), (c), (d), or
(e) for $d(p_1,p_0)$. As far as topological questions are concerned, our choice will
turn out to be unimportant. On the other hand, our choice will obviously affect
the way we interpret $d(p_1,p_0)$. However, the question of interpretation is a part
of the mathematical modeling problem and not a purely mathematical question. ∎


Thus far we have discussed the product of two metric spaces. One can also
define the product of more than two metric spaces. For example, assume that we
are given $n$ metric spaces $(X_1,d_1), (X_2,d_2), \ldots, (X_n,d_n)$. Let

$$X = X_1 \times X_2 \times \cdots \times X_n = \prod_{i=1}^{n} X_i,$$

where the ordered n-tuple $x = (x_1,\ldots,x_n)$, $x_i \in X_i$, $i = 1,2,\ldots,n$, is an element
of the set $X$. As before there are many ways of defining metrics on $X$; however,
the three most common ways are

(a) $d_1(x,y) = d_1(x_1,y_1) + \cdots + d_n(x_n,y_n)$,
(b) $d_2(x,y) = [d_1(x_1,y_1)^2 + \cdots + d_n(x_n,y_n)^2]^{1/2}$,
(c) $d_\infty(x,y) = \max\{d_1(x_1,y_1), \ldots, d_n(x_n,y_n)\}$.




Later, after we define the concept of equivalence between metrics in Section 
11, we shall show that the above metrics on X are equivalent to one another. 


EXERCISES 


1. Let $\xi = (\xi_1,\xi_2,\xi_3)$ be an $R^3$-valued random variable (see Appendix E) defined
on a probability space $(\Omega,\mathcal{A},P)$. Let $\Xi$ denote those random variables $\xi$ with
$E(|\xi|^2) < \infty$, where $E$ denotes the expectation and $|\xi|^2 = \xi_1^2 + \xi_2^2 + \xi_3^2$.
Show that

$$d(\xi,\nu) = \{E(|\xi - \nu|^2)\}^{1/2}$$

is a metric on $\Xi$, where $\xi - \nu = (\xi_1 - \nu_1, \xi_2 - \nu_2, \xi_3 - \nu_3)$.

(If $B$ denotes a prescribed set in $R^3$, say a box, one sometimes is interested
in the subspace $A$ of $\Xi$ given by

$$A = \{\xi \in \Xi : P(\xi \notin B) = 0\},$$

that is, the probability that $\xi$ does not lie in $B$ is zero.)


2. Show that the thrust vector $f(t) = (f_1(t), f_2(t), f_3(t))$ for a rocket engine can be
modeled as a point in the product of metric spaces.


3. Let $N = \{1,2,\ldots\}$ be the natural numbers with the usual metric $d(n,m) = |n - m|$.
On $N \times N$ define a function by

$$\rho((n_1,n_2),(m_1,m_2)) = |n_1 m_2 - n_2 m_1| \cdot (n_2 m_2)^{-1}.$$

Show that $\rho$ is a pseudometric. Show that

$$\rho((n_1,1),(m_1,1)) = |n_1 - m_1|,$$

$$\rho((1,n_2),(1,m_2)) = \left|\frac{1}{n_2} - \frac{1}{m_2}\right|.$$


4, Find some other examples of subspaces and product spaces. 


5. CONTINUOUS FUNCTIONS 


As has already been suggested, the introduction of abstract metric spaces
allows the generalization of the concept of continuity. This generalization is one
of the chief reasons for investigating metric spaces. An equally important reason,
the generalization of the concept of convergence, is discussed in the next section.

Let us recall the definition of continuity for real-valued functions $F: R \to R$.
We say that $F$ is continuous at $x_0$ if for every real number $\varepsilon > 0$, there is a real
number $\delta > 0$, such that $|F(x) - F(x_0)| < \varepsilon$, whenever $|x - x_0| < \delta$. The function
$F$ is said to be continuous if it is continuous at each point in the domain of definition.
In this case the appropriate $\delta$'s are determined by $x_0$ as well as $\varepsilon$, so we
sometimes write $\delta = \delta(\varepsilon,x_0)$. It can happen that a $\delta$ can be found which "works"
for all $x_0$. More precisely, the following may hold: "For every $\varepsilon > 0$, there is a
$\delta > 0$ such that for any $x_0$ in $R$ one has $|F(x) - F(x_0)| < \varepsilon$ whenever $|x - x_0| < \delta$."
Here $\delta$ depends on $\varepsilon$ only, and $F$ is said to be uniformly continuous.

Recall that a function $F: X \to Y$ is a set-theoretic concept. In order to define
continuity we must require that the sets $X$ and $Y$ have some additional structure.
In particular, the additional structure needed is a topological structure. More
precisely, we will use the topological structure generated by metrics on $X$ and $Y$.


3.5.1 DEFINITION. Let $F: X \to Y$ be a mapping of the metric space $(X,d_x)$
into the metric space $(Y,d_y)$. The mapping $F$ is said to be continuous at the point
$x_0$ in $X$ if for every real number $\varepsilon > 0$, there exists a real number $\delta > 0$ such that
$d_y(F(x),F(x_0)) < \varepsilon$ whenever $d_x(x,x_0) < \delta$. The mapping $F$ is said to be continuous
if it is continuous at each point in its domain.


Note that this definition includes continuity of real-valued functions of a real
variable as a special case. In particular, Definition 3.5.1 reduces to the familiar
definition of continuity when $X = Y = R$ and $d_x(x,y) = d_y(x,y) = |x - y|$. Also
note that, as before, if $F$ is continuous at a point $x_0$, then the $\delta$'s are determined
by $x_0$ as well as $\varepsilon$ and we write $\delta = \delta(\varepsilon,x_0)$. The concept of uniform continuity is
generalized in the obvious way.


3.5.2 DEFINITION. A mapping $F$ of a metric space $(X,d_x)$ into a metric space
$(Y,d_y)$ is said to be uniformly continuous if for each $\varepsilon > 0$, there exists a $\delta = \delta(\varepsilon) > 0$
such that for any $x_0$ one has $d_y(F(x),F(x_0)) < \varepsilon$ whenever $d_x(x,x_0) < \delta$.

The function $\delta(\varepsilon,x_0)$ or $\delta(\varepsilon)$ is sometimes referred to as the modulus of
continuity of $F$.

Let us consider a few examples of some mappings that are continuous and 
some that are not. 


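EXAMPLE 1. Let $X = R^n$ and $Y = R^m$, and consider mappings $F$ of $X$ into
$Y$ that can be represented with the following matrix formulation:

$$\begin{pmatrix} y_1 \\ \vdots \\ y_m \end{pmatrix} =
\begin{pmatrix} a_{11} & a_{12} & \cdots & a_{1n} \\ a_{21} & a_{22} & \cdots & a_{2n} \\ \vdots & & & \vdots \\ a_{m1} & a_{m2} & \cdots & a_{mn} \end{pmatrix}
\begin{pmatrix} x_1 \\ \vdots \\ x_n \end{pmatrix},$$

where the $a_{ij}$'s are real numbers. Consider the following metrics on $X$ and $Y$.
For $X$:

$$d_x(u,v) = \{|u_1 - v_1|^2 + \cdots + |u_n - v_n|^2\}^{1/2}.$$

For $Y$:

$$d_y(w,z) = \{|w_1 - z_1|^2 + \cdots + |w_m - z_m|^2\}^{1/2}.$$

Let $x_0$ be an arbitrary element in $X$, that is,

$$x_0 = \begin{pmatrix} x_{01} \\ \vdots \\ x_{0n} \end{pmatrix}.$$

The image of $x_0$ under a mapping $F$ is given by

$$y_0 = \begin{pmatrix} y_{01} \\ y_{02} \\ \vdots \\ y_{0m} \end{pmatrix} =
\begin{pmatrix} a_{11} x_{01} + a_{12} x_{02} + \cdots + a_{1n} x_{0n} \\ a_{21} x_{01} + \cdots + a_{2n} x_{0n} \\ \vdots \\ a_{m1} x_{01} + \cdots + a_{mn} x_{0n} \end{pmatrix}.$$

It follows that if $x$ is another arbitrary point in $X$ and $y$ is its image, then

$$d(y,y_0)^2 = \sum_{i=1}^{m} \left| \sum_{j=1}^{n} a_{ij} (x_j - x_{0j}) \right|^2.$$

Using the Schwarz Inequality (Appendix A), it follows that

$$d(y,y_0)^2 \le \sum_{i=1}^{m} \left( \sum_{j=1}^{n} |a_{ij}|^2 \right) \left( \sum_{j=1}^{n} |x_j - x_{0j}|^2 \right) \le A^2 d(x,x_0)^2,$$

where $A = \left(\sum_{i,j} |a_{ij}|^2\right)^{1/2}$. Then given an $\varepsilon > 0$ choose $\delta = \varepsilon/A$, provided $A \ne 0$.
It is clear that $d(y,y_0) < \varepsilon$ whenever $d(x,x_0) < \delta$. Hence every such mapping $F$
is continuous. (What happens if $A = 0$?) ∎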
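
The bound $d(y,y_0) \le A\,d(x,x_0)$ can be spot-checked numerically. The sketch below is illustrative only: the matrix entries, the random sample points, and the helper names (`apply_matrix`, `euclid`, `frobenius`) are assumptions, and a finite sample does not replace the argument via the Schwarz Inequality.

```python
import math
import random

def apply_matrix(a, x):
    """Image y = a x of the vector x under the matrix a (given as a list of rows)."""
    return [sum(aij * xj for aij, xj in zip(row, x)) for row in a]

def euclid(u, v):
    """Euclidean metric between two vectors of equal length."""
    return math.sqrt(sum((ui - vi) ** 2 for ui, vi in zip(u, v)))

def frobenius(a):
    """The constant A = (sum over i, j of |a_ij|^2) ** (1/2)."""
    return math.sqrt(sum(aij ** 2 for row in a for aij in row))

random.seed(0)
a = [[1.0, -2.0, 0.5],
     [0.0, 3.0, 1.0]]            # hypothetical 2 x 3 matrix (m = 2, n = 3)
A = frobenius(a)

bound_holds = True
for _ in range(1000):
    x = [random.uniform(-5.0, 5.0) for _ in range(3)]
    x0 = [random.uniform(-5.0, 5.0) for _ in range(3)]
    if euclid(apply_matrix(a, x), apply_matrix(a, x0)) > A * euclid(x, x0) + 1e-9:
        bound_holds = False
```

Since $\delta = \varepsilon/A$ does not depend on $x_0$, the same bound suggests that such mappings are in fact uniformly continuous when $A \ne 0$.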


EXAMPLE 2. Consider $Y = L_2(-\infty,\infty)$ with the usual metric (see Example
10, Section 3) and let $X$ be the subspace of $Y$ made up of all points $x$ in $Y$ such that

$$\int_{-\infty}^{\infty} \left| \int_{-\infty}^{t} x(\tau) \, d\tau \right|^2 dt < \infty.$$

It follows that

$$y(t) = \int_{-\infty}^{t} x(\tau) \, d\tau$$

represents a mapping of $X$ into $Y$. Denote this mapping by $F$.

The mapping $F$ is not continuous; in fact, it is not continuous at any point
in $X$. In order to show this, we start with an arbitrary $x_0 \in X$ and we then seek
an $\varepsilon > 0$, say $\varepsilon_0$, such that no matter which $\delta > 0$ we consider there is always at
least one $x \in X$ with the property that $d(x,x_0) < \delta$ and $d(F(x),F(x_0)) \ge \varepsilon_0$.




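Set $\varepsilon_0 = 1$. Let $x$ be an arbitrary point in $X$, and let $y_0 = F(x_0)$ and $y = F(x)$.
Then

$$y(t) - y_0(t) = \int_{-\infty}^{t} [x(\tau) - x_0(\tau)] \, d\tau.$$

Let $x$ be chosen so that

$$x(t) - x_0(t) = \begin{cases} c, & 0 \le t \le 3T^2 \\ -c, & 3T^2 < t \le 6T^2 \\ 0, & \text{otherwise.} \end{cases}$$

Then $d(x,x_0) = \sqrt{6}\,cT$ and

$$y(t) - y_0(t) = \begin{cases} ct, & 0 \le t \le 3T^2 \\ c(6T^2 - t), & 3T^2 < t \le 6T^2 \\ 0, & \text{otherwise.} \end{cases}$$

Furthermore $d(y,y_0) = \sqrt{18}\,cT^3$. Let $\delta > 0$ be given and choose $c$ and $T$ so that

$$\sqrt{18}\,cT^3 = 1 \quad \text{and} \quad \sqrt{6}\,cT < \delta.$$

One then has $d(x,x_0) < \delta$ and $d(y,y_0) = 1$, which proves that $F$ is not continuous
at $x_0$. Since $x_0$ is an arbitrary point in $X$, we see that $F$ is nowhere continuous. ∎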
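
The choice of $c$ and $T$ in this example can be made explicit and checked numerically. In the sketch below, which is an illustration under stated assumptions, we take $T^2 = 2/(\sqrt{3}\,\delta)$ so that $\sqrt{6}\,cT = \delta/2$; the function names, the midpoint-rule integration, and the value $\delta = 0.01$ are ours, not the text's.

```python
import math

def perturbation(delta):
    """Return (c, T) with sqrt(18)*c*T**3 = 1 and sqrt(6)*c*T = delta/2 < delta."""
    T = math.sqrt(2.0 / (math.sqrt(3.0) * delta))
    c = 1.0 / (math.sqrt(18.0) * T ** 3)
    return c, T

def l2_norms(c, T, steps=20000):
    """Midpoint-rule estimates of the L_2 norms of x - x0 and y - y0 on [0, 6T^2]."""
    end = 6.0 * T * T
    half = 3.0 * T * T
    h = end / steps
    sx = sy = 0.0
    for k in range(steps):
        t = (k + 0.5) * h
        dx = c if t <= half else -c                  # x(t) - x0(t)
        dy = c * t if t <= half else c * (end - t)   # y(t) - y0(t)
        sx += dx * dx * h
        sy += dy * dy * h
    return math.sqrt(sx), math.sqrt(sy)

delta = 0.01
c, T = perturbation(delta)
dist_x, dist_y = l2_norms(c, T)   # approximately delta/2 and 1, respectively
```

However small $\delta$ is taken, the input perturbation stays within $\delta$ while the output perturbation stays at distance 1; the perturbation is made small in amplitude but long in duration.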


EXAMPLE 3. Let $Y$ be the metric space $L_2(-i\infty,i\infty)$ defined in Example
13, Section 3 and let $X$ be the subspace of $(Y,d)$ made up of all functions such that

$$\frac{1}{2\pi i} \int_{-i\infty}^{i\infty} \left| \frac{X(s)}{s} \right|^2 ds < \infty.$$

It then follows that if $Y = FX$ is given by

$$Y(s) = \frac{X(s)}{s},$$

this represents a mapping $F$ of $X$ into $Y$. Using the same general approach as used
in Example 2, it can be shown that $F$ is not continuous at any point in $X$. ∎


The reader undoubtedly recognizes that Examples 2 and 3 are related to one 
another through the Fourier Transform, or the two-sided Laplace Transform. 


EXAMPLE 4. This example is similar to Example 2 except for the important
difference that the functions considered are defined on a finite instead of an infinite
interval. Let $X = Y = L_2[0,T]$, where $0 < T < \infty$, be given with the usual metric.
Then

$$y(t) = \int_0^t x(\tau)\,d\tau$$

represents a mapping F of $L_2[0,T]$ into itself. Moreover, F is continuous. Let $x_0$
and x be arbitrary points in X, and let $y_0 = F(x_0)$ and $y = F(x)$. Then

$$|y(t) - y_0(t)| = \Bigl|\int_0^t [x(\tau) - x_0(\tau)]\,d\tau\Bigr|
\le \int_0^T |x(\tau) - x_0(\tau)|\,d\tau
\le \Bigl\{\int_0^T 1^2\,d\tau\Bigr\}^{1/2} \Bigl\{\int_0^T |x(\tau) - x_0(\tau)|^2\,d\tau\Bigr\}^{1/2}
= \sqrt{T}\,\Bigl\{\int_0^T |x(\tau) - x_0(\tau)|^2\,d\tau\Bigr\}^{1/2},$$

where the Schwarz Inequality for integrals (Appendix A) is used in the second to
the last step. Therefore,

$$\int_0^T |y(t) - y_0(t)|^2\,dt \le T^2 \int_0^T |x(\tau) - x_0(\tau)|^2\,d\tau$$

or

$$d(y,y_0) \le T\,d(x,x_0).$$

This last inequality implies that F is uniformly continuous. (Why?) ∎


The reader should carefully compare Examples 2 and 4. 
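As a numerical companion to Example 4, the sketch below approximates the integration mapping on a grid and compares d(y,y0) with T·d(x,x0). It is a hedged illustration, not the book's method: the function names, the trapezoid discretization, and the three test differences x − x0 are all assumptions chosen for the demonstration. Since F is linear, y − y0 is obtained by applying F to x − x0.

```python
import math

def integrate_operator(u, T, n=2000):
    """Grid samples of u and of v(t) = integral_0^t u(tau) dtau (trapezoid rule)."""
    h = T / n
    us = [u(k * h) for k in range(n + 1)]
    vs = [0.0]
    for k in range(n):
        vs.append(vs[-1] + 0.5 * h * (us[k] + us[k + 1]))
    return us, vs, h

def l2_norm_samples(vals, h):
    """Trapezoid-rule approximation of the L2[0, T] norm from grid samples."""
    sq = [v * v for v in vals]
    total = 0.5 * (sq[0] + sq[-1]) + sum(sq[1:-1])
    return math.sqrt(total * h)

T = 2.0
# Hypothetical choices of the difference x - x0; any square-integrable one would do.
cases = [("constant", lambda t: 1.0),
         ("sine", lambda t: math.sin(5 * t)),
         ("ramp", lambda t: t - 1.0)]

results = {}
for name, u in cases:
    us, vs, h = integrate_operator(u, T)
    dx = l2_norm_samples(us, h)   # approximates d(x, x0)
    dy = l2_norm_samples(vs, h)   # approximates d(y, y0)
    results[name] = (dx, dy)
    print(f"{name}: d(y,y0)={dy:.4f} <= T*d(x,x0)={T * dx:.4f}")
```

In contrast to Example 2, the ratio d(y,y0)/d(x,x0) stays bounded by T here because the interval of integration is finite.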


EXAMPLE 5. Let the metric spaces X and Y be the same as those used in
Example 4. Let $k(t,\tau)$ be a real-valued function defined on $[0,T] \times [0,T]$ such
that

$$\int_0^T\!\!\int_0^T |k(t,\tau)|^2\,dt\,d\tau = M^2 < \infty.$$

Then

$$y(t) = \int_0^T k(t,\tau)\,x(\tau)\,d\tau$$

represents a continuous mapping K of $L_2[0,T]$ into itself.
Note that this is a generalization of Example 4 where

$$k(t,\tau) = \begin{cases} 1, & \text{for } t \ge \tau \\ 0, & \text{otherwise.} \end{cases}$$

Using essentially the same argument as used in Example 4, we find that

$$d(y,y_0) \le M\,d(x,x_0).$$

Hence, K is a continuous mapping of $L_2[0,T]$ into itself. ∎


EXAMPLE 6. If in Example 5 we use $L_2(-\infty,\infty)$, then we have a continuous
mapping of $L_2(-\infty,\infty)$ into itself. However, the requirement that

$$\int_{-\infty}^{\infty}\!\!\int_{-\infty}^{\infty} |k(t,\tau)|^2\,dt\,d\tau = M^2 < \infty \tag{3.5.1}$$

is quite restrictive. That is, there are many integral operators with kernels that do
not satisfy (3.5.1) but which do correspond to continuous mappings of $L_2(-\infty,\infty)$
into itself. For example,

$$k(t,\tau) = \begin{cases} e^{-(t-\tau)}, & \text{for } t \ge \tau \\ 0, & \text{otherwise} \end{cases}$$

corresponds (see Exercise 17, Section 5.6) to a continuous mapping of $L_2(-\infty,\infty)$
into itself, but

$$\int_{-\infty}^{\infty}\!\!\int_{-\infty}^{t} e^{-2(t-\tau)}\,d\tau\,dt = +\infty. \ ∎$$


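EXAMPLE 7. Let $X = Y = l_2(0,\infty)$ be given with the usual metric. (See
Example 4, Section 3.)
Let A be an infinite matrix

$$A = \begin{bmatrix} a_{11} & a_{12} & \cdots \\ a_{21} & a_{22} & \cdots \\ \vdots & \vdots & \end{bmatrix}$$

such that

$$\sum_{j=1}^{\infty} \sum_{i=1}^{\infty} |a_{ij}|^2 = M^2 < \infty. \tag{3.5.2}$$

Using the Schwarz Inequality for infinite series the reader should be able to show
that the following matrix formulation represents a continuous mapping of $l_2$ into
itself:

$$\begin{bmatrix} y_1 \\ y_2 \\ \vdots \end{bmatrix} = \begin{bmatrix} a_{11} & a_{12} & \cdots \\ a_{21} & a_{22} & \cdots \\ \vdots & \vdots & \end{bmatrix} \begin{bmatrix} x_1 \\ x_2 \\ \vdots \end{bmatrix}. \tag{3.5.3}$$

It should be said that (3.5.2) is not a necessary condition for (3.5.3) to represent
a continuous mapping of $l_2$ into itself. ∎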
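A finite truncation gives a feel for the bound that condition (3.5.2) produces. The sketch below is illustrative only: the matrix entries a_ij = 1/(ij), the truncation size n = 50, and the two test vectors are hypothetical choices, not taken from the text. It checks that d(y,y0) ≤ M·d(x,x0), where M² is the sum of the squared entries.

```python
import math

def apply_matrix(a, x):
    """Matrix-vector product y_i = sum_j a_ij x_j for a finite truncation."""
    return [sum(aij * xj for aij, xj in zip(row, x)) for row in a]

def l2(v):
    """Euclidean (truncated l2) norm."""
    return math.sqrt(sum(t * t for t in v))

n = 50
# Hypothetical matrix a_ij = 1/(i*j); the double sum of squares stays bounded as n grows.
a = [[1.0 / (i * j) for j in range(1, n + 1)] for i in range(1, n + 1)]
M = math.sqrt(sum(aij * aij for row in a for aij in row))

x = [1.0 / k for k in range(1, n + 1)]
x0 = [(-1.0) ** k / k**2 for k in range(1, n + 1)]
y, y0 = apply_matrix(a, x), apply_matrix(a, x0)

dx = l2([s - t for s, t in zip(x, x0)])   # d(x, x0)
dy = l2([s - t for s, t in zip(y, y0)])   # d(y, y0)
print(f"d(y,y0) = {dy:.6f} <= M*d(x,x0) = {M * dx:.6f}")
```

The inequality is the Schwarz Inequality applied row by row, so it holds for every truncation size; the sketch does not address matrices that violate (3.5.2) yet still define continuous mappings.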


A final yet important point concerns the continuity of inverses. If an invertible
mapping F is continuous, the mere continuity of F does not say anything about the
continuity of $F^{-1}$. Similarly, if F is not continuous, then this fact alone does not
say anything about the continuity of $F^{-1}$.


EXERCISES 


1. Let $C^\infty[0,T]$ denote the set of all real-valued infinitely differentiable functions
x defined on $0 \le t \le T$. Let D be the mapping of $C^\infty[0,T]$ into itself defined by

$$Dx = \frac{dx}{dt}.$$

(a) Let $d_\infty$ be the sup-metric on $C^\infty[0,T]$. Is D a continuous mapping of
$(C^\infty[0,T],d_\infty)$ into itself?
(b) Define a metric $d_1$ on $C^\infty[0,T]$ by

$$d_1(x,y) = d_\infty(x,y) + d_\infty(Dx,Dy).$$

Is D a continuous mapping of $(C^\infty[0,T],d_1)$ into $(C^\infty[0,T],d_\infty)$?


2. A delay line is a device whose output is ideally a delayed version of its input.
That is, a mathematical model for a delay line as shown in Figure 3.5.1 is
given by

$$y(t) = x(t - \tau).$$

Figure 3.5.1. Delay line with a delay of $\tau$ seconds.

Suppose that $x, y \in L_2(-\infty,\infty)$, where $L_2(-\infty,\infty)$ has the usual metric. Is
the mathematical model of the delay line a continuous mapping of $L_2(-\infty,\infty)$
into itself?


3. Let $Y = C[0,T]$ be given with the sup-metric $d(x,y)$. Let

$$X = (C[0,T],d) \times (C[0,T],d)$$

be the product space. Consider the mapping F of X into Y defined by
$F(x) = x_1 x_2$, where $x = (x_1,x_2)$. (That is, F is a "multiplier.") Is F continuous?
Is F uniformly continuous?


4. Define continuity and uniform continuity for functions defined on
pseudometric spaces. Give examples of continuous and discontinuous functions on
pseudometric spaces.


5. Let (X,d) be a metric space, where X is nonempty. Let $Y = BC(X,R)$ denote
the collection of all bounded, continuous real-valued functions defined on X.
(a) Show that the functions

$$f_1: x \to \frac{d(x,x_0)}{1 + d(x,x_0)}, \qquad f_2: x \to d(x,x_1) - d(x,x_0), \qquad f_3: x \to 3$$

are in Y. (So Y is nonempty.)
(b) Show that

$$\sigma(f,g) = \sup\{|f(x) - g(x)| : x \in X\}$$

is a metric on Y.


6. Let $(X,d_1)$ and $(Y,d_2)$ be two metric spaces, where X and Y are nonempty.
Assume that every set in Y is bounded. (See Exercise 3, Section 3.2.) Then also
let $Z = C(X,Y)$ denote the space of all continuous functions from X into Y.
Show that Z is nonempty. Show that

$$\sigma(f,g) = \sup\{d_2(f(x),g(x)) : x \in X\}$$

is a metric on Z. What happens if one does not assume that every set in Y is
bounded?


7. (Hölder continuity.) A real-valued function f defined on a closed, bounded
interval I is said to satisfy an $\alpha$-Hölder condition on I if there is a constant k
(called the Hölder coefficient) such that $|f(t) - f(s)| \le k|t - s|^\alpha$ for $t, s \in I$.
Let $C^\alpha(I)$ denote the collection of all such functions, where $0 < \alpha$.

(a) Show that $f \in C^\alpha(I)$ for $1 < \alpha$ implies that $f(t) = \text{constant}$.

(b) Show that $0 < \beta < \alpha \le 1$ implies that $C^\alpha(I) \subset C^\beta(I)$.

(c) Show that for $0 < \alpha \le 1$, the function

$$d^\alpha(f,g) = \sup\{|f(t) - g(t)| : t \in I\} + \sup\Bigl\{\Bigl|\frac{|f(t) - f(s)|}{|t - s|^\alpha} - \frac{|g(t) - g(s)|}{|t - s|^\alpha}\Bigr| : t, s \in I,\ t \ne s\Bigr\}$$

is a metric on $C^\alpha(I)$.
(d) Show that the mapping $f \to f$ of $(C^\alpha,d^\alpha)$ into $(C^\beta,d^\beta)$, where $0 < \beta < \alpha \le 1$,
is continuous.


8. Let (X,d) be a metric space and let $LC(X,R) = LC$ denote the space of
real-valued (globally) Lipschitz continuous functions defined on X. That is, $f \in LC$
if there is a constant k such that

$$|f(x) - f(y)| \le k\,d(x,y) \qquad (x, y \in X). \tag{3.5.4}$$

Let $\|f\|$ denote the smallest number k that satisfies (3.5.4). (This is a
pseudonorm, see Exercise 3, Section 5.2.)

(a) Show that if $f, g \in LC$ and $h(x) = f(x) + g(x)$, then $h \in LC$. Also $af \in LC$,
where a is any real number.

(b) Show that $\sigma(f,g) = \|f - g\|$ is a pseudometric on LC. Discuss the
relationship $\sigma(f,g) = 0$.

(c) Show that LC is nonempty. [Hint: Consider $f(x) = d(x,x_0)$ where $x_0$ is
a fixed point in X.]

(d) Let $LC_{x_0}$ denote those f in LC that satisfy $f(x_0) = 0$. Show that $\sigma$ is a
metric on $LC_{x_0}$. Compare the spaces $LC_{x_0}$ and $LC_{x_1}$, where $x_0 \ne x_1$.
[Note: $\|f\|$ is a "norm" on $LC_{x_0}$, see Section 5.2.]


9. (Continuation of Exercise 8.) Let $LC_{x_0}^*$ denote the collection of all functions
$l: LC_{x_0} \to R$ that satisfy the following conditions:

(i) $l(f + g) = l(f) + l(g)$.
(ii) $l(af) = a\,l(f)$.
(iii) $\sup\{|l(f)| : \|f\| \le 1\} < \infty$.

For $x \in X$, let $\delta_x$ be the Dirac-function $\delta_x(f) = f(x)$.
(a) Show that $\delta_x \in LC_{x_0}^*$ for every $x \in X$.

For $l_1, l_2$ in $LC_{x_0}^*$, let

$$\|l_1 - l_2\| = \sup\{|l_1(f) - l_2(f)| : \|f\| \le 1\}.$$

(b) Show that $\|\delta_x - \delta_y\| \le d(x,y)$, for all $x, y \in X$.
(c) Show that $\sigma^*(l_1,l_2) = \|l_1 - l_2\|$ is a metric on $LC_{x_0}^*$.


10. (Continuation of Exercise 12, Section 3.) Let X denote the class of all
complex-valued functions f(z) that are analytic for $|z| < 1$. Let $D: X \to X$ be the
differential operator $D: f \to df/dz$. By using the metrics, or pseudometrics,
in Exercise 12, Section 3, discuss the continuity of D.


11. Let $f: X \to X$ be a continuous mapping, where X has a metric d. Let Gr(f)
denote the graph (Section 2.6) of f in $X \times X$ and $\Delta$ the diagonal set

$$\Delta = \{(x,y): x = y\} = \{(x,x): x \in X\}.$$

Assume that $X \times X$ has the metric

$$d((x_1,y_1),(x_2,y_2)) = d(x_1,x_2) + d(y_1,y_2).$$

Define $g: \Delta \to Gr(f)$ by

$$g(x,x) = (x,f(x)).$$

Show that g is continuous. Show that g is invertible. Is $g^{-1}$ continuous?


12. In Definition 3.5.1 we used the strong inequalities "$d_2(F(x),F(x_0)) < \varepsilon$" and
"$d_1(x,x_0) < \delta$" to define continuity at $x_0$. Show that one can replace either
or both of these inequalities with the weak inequalities "$d_2(F(x),F(x_0)) \le \varepsilon$"
and "$d_1(x,x_0) \le \delta$," without changing the concept of "continuity at $x_0$."


13. Show that the mapping $f: N \times N \to Q$, given by $(n_1,n_2) \to (n_1/n_2)$, is continuous
when Q has the usual metric and $N \times N$ has the metric given in Exercise 3,
of Section 4. Is f one-to-one?


6. CONVERGENT SEQUENCES


In addition to the foregoing generalization of continuity, the introduction
of metric spaces allows us to generalize the concept of convergent sequences.
First, let us recall the definition of a convergent sequence of real numbers. We say
that a sequence of real numbers $\{x_n\} = \{x_1,x_2,\ldots\}$ is convergent if there is a real
number $x_0$ with the property that for each real number $\varepsilon > 0$ there is an integer
N such that $|x_n - x_0| < \varepsilon$ whenever $n \ge N$. We say that $x_0$ is the limit of the
sequence $\{x_n\}$. Since an appropriate integer N is determined by $\varepsilon$, we sometimes
write $N = N(\varepsilon)$. Our intuitive picture is that N becomes larger as $\varepsilon$ becomes smaller.
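The dependence of N on ε can be made concrete with a small sketch. The sequence x_n = 1/n and the function name `N_of_eps` are hypothetical choices made for this illustration; the check over a finite range of indices can support the defining inequality but of course cannot prove it for all n.

```python
import math

def d(x, y):
    """Usual metric on the real line."""
    return abs(x - y)

def x_n(n):
    """Hypothetical example sequence x_n = 1/n, whose limit is x0 = 0."""
    return 1.0 / n

def N_of_eps(eps):
    """An integer N with d(x_n, 0) < eps for all n >= N (1/n < eps iff n > 1/eps)."""
    return math.floor(1.0 / eps) + 1

for eps in (0.5, 0.1, 0.003):
    N = N_of_eps(eps)
    # Spot-check the defining inequality on a finite range of indices n >= N.
    assert all(d(x_n(n), 0.0) < eps for n in range(N, N + 10_000))
    print(f"eps={eps}: N(eps)={N}")
```

As the text suggests, the returned N grows as ε shrinks.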




Suppose that instead of real numbers we have a sequence of points in a metric
space (X,d). We again denote our sequence by $\{x_n\} = \{x_1,x_2,\ldots\}$. The
generalization is straightforward.


3.6.1 DEFINITION. A sequence $\{x_n\}$ of points in a metric space (X,d) is said
to be convergent if there is a point $x_0$ in (X,d) with the property that for each real
number $\varepsilon > 0$ there is an integer N such that $d(x_n,x_0) < \varepsilon$ whenever $n \ge N$. The
point $x_0$ is said to be the limit of the sequence $\{x_n\}$. This is sometimes written

$$\lim x_n = \lim_{n \to \infty} x_n = x_0,$$

or

$$x_n \to x_0 \quad\text{as } n \to \infty.$$

Note that $x_0$ is referred to as the limit and not as a limit. In doing so we are
anticipating the next result which says that a convergent sequence in a metric space
has precisely one limit.


3.6.2 LEMMA. Let $\{x_n\}$ be a convergent sequence in a metric space (X,d). If
$y_0$ and $x_0$ in (X,d) are limits of the sequence $\{x_n\}$, then $y_0 = x_0$.

Proof: If we can show that $d(x_0,y_0) < 2\varepsilon$ for every $\varepsilon > 0$, then it follows that
$d(x_0,y_0) = 0$ and, from axiom (M2), that $x_0 = y_0$. Let $\varepsilon > 0$ be given. Since $x_0$
is a limit of $\{x_n\}$, there is an $N_1$ such that $n \ge N_1$ implies that $d(x_n,x_0) < \varepsilon$. Similarly,
there is an $N_2$ such that $n \ge N_2$ implies that $d(x_n,y_0) < \varepsilon$. Let $M = \max[N_1,N_2]$.
Then by the triangle inequality one has

$$d(x_0,y_0) \le d(x_M,x_0) + d(x_M,y_0) < 2\varepsilon. \ ∎$$

Carefully note that the limit $x_0$ must be a point in (X,d). For example, if X
is the open interval $0 < x < 1$ and $d(x,y) = |x - y|$, the sequence $\{\tfrac12,\tfrac13,\tfrac14,\ldots\}$ is
not convergent because the apparent limit 0 is not a point in (X,d).


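EXAMPLE 1. Suppose that $X = C[0,1]$ is given with the metric

$$d_2(x,y) = \Bigl\{\int_0^1 |x(t) - y(t)|^2\,dt\Bigr\}^{1/2}.$$

Consider the sequence $\{x_n\}$ in $(C[0,1],d_2)$, where

$$x_n(t) = \begin{cases} 1 - nt, & \text{for } 0 \le t \le 1/n \\ 0, & \text{for } 1/n < t \le 1. \end{cases}$$

We claim that this sequence converges and the limit is $x_0(t) = 0$. Indeed

$$d_2(x_n,x_0) = \Bigl\{\int_0^1 |x_n(t) - x_0(t)|^2\,dt\Bigr\}^{1/2} = \Bigl\{\int_0^{1/n} (1 - nt)^2\,dt\Bigr\}^{1/2} = (3n)^{-1/2}.$$

Given an $\varepsilon > 0$, if $n \ge N$, where N is an integer greater than $\varepsilon^{-2}$, then $d_2(x_n,x_0) < \varepsilon$.
Hence, $x_0 = \lim_{n \to \infty} x_n$. ∎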




EXAMPLE 2. This example is the same as Example 1 except for a change of
metric. Here we use the sup-metric $d_\infty$. The sequence $\{x_n\}$ is now a sequence of
points in the metric space $\{C[0,T],d_\infty\}$. Obviously the sequence does not converge
to $x_0(t) = 0$, for $d_\infty(x_n,x_0) = 1$ for each n. One might be tempted to say that
$\{x_n\}$ converges to $y_0$, where

$$y_0(t) = \begin{cases} 1, & \text{for } t = 0 \\ 0, & \text{for } 0 < t \le T. \end{cases}$$

But this would be nonsense, for $y_0$ is not a point in $\{C[0,T],d_\infty\}$. In fact the
sequence $\{x_n\}$ just does not converge in this metric. We can see this by noting that
for an arbitrary n we can find an $N > n$ such that $d_\infty(x_n,x_m) \ge \tfrac12$ for all $m \ge N$.
(Why does this observation show that $\{x_n\}$ cannot converge? See Exercise 8
below.) ∎
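The contrast between Examples 1 and 2 can be seen numerically. The sketch below takes the interval to be [0,1] as in Example 1; the function names and the grid-based approximations of the two metrics are assumptions made for illustration. It compares the approximate value of d_2(x_n,0) with (3n)^(-1/2) and evaluates the sup-distance between x_n and x_2n.

```python
import math

def x(n, t):
    """x_n(t) = 1 - n t on [0, 1/n], and 0 on (1/n, 1]."""
    return 1.0 - n * t if t <= 1.0 / n else 0.0

def d2_to_zero(n, grid=20_000):
    """Midpoint-rule approximation of d_2(x_n, 0) on [0, 1]."""
    h = 1.0 / grid
    return math.sqrt(sum(x(n, (k + 0.5) * h) ** 2 for k in range(grid)) * h)

def dsup(n, m, grid=20_000):
    """Grid approximation of d_inf(x_n, x_m) = sup |x_n(t) - x_m(t)| on [0, 1]."""
    return max(abs(x(n, k / grid) - x(m, k / grid)) for k in range(grid + 1))

for n in (1, 4, 16, 64):
    print(f"n={n}: d2={d2_to_zero(n):.5f}, (3n)^(-1/2)={(3 * n) ** -0.5:.5f}, "
          f"d_inf(x_n, x_2n)={dsup(n, 2 * n):.5f}")
```

The d_2 column shrinks toward zero while the sup-distance between x_n and x_2n stays near 1/2, which is consistent with convergence in the first metric and failure of convergence in the second.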


It should be mentioned that in practice we are not usually confronted with
the problem of determining whether or not a point $x_0$ is the limit of a sequence
$\{x_n\}$. Usually we are primarily interested in determining whether or not a given
sequence is convergent. Knowing this, we sometimes then seek the limit. Both of
these problems are often difficult to solve.


EXERCISES 


1. Let X denote the set of all bounded piecewise continuous² functions defined
on $0 \le t \le T$, with the sup-metric $d_\infty$.
(a) Obviously $C[0,T]$ is a subspace of X. Let $x_0$ be an arbitrary point in
$C[0,T]$. Suppose that $x_0$ is to be approximated by piecewise constant
functions as shown in Figure 3.6.1. That is,

$$x_n(t) = x_0\Bigl(\frac{jT}{n}\Bigr) \quad\text{for } t \in \Bigl[\frac{jT}{n}, \frac{(j+1)T}{n}\Bigr) \text{ and } j = 0,1,\ldots,(n-1),$$

and

$$x_n(T) = x_0\Bigl(\frac{n-1}{n}\,T\Bigr).$$

Is it true that the sequence $\{x_n\}$ converges in $(X,d_\infty)$? If so, is it true that
$x_0 = \lim_{n \to \infty} x_n$? [Hint: Use the fact that a real-valued continuous function,
defined on a bounded closed interval is uniformly continuous, compare
with Exercise 13, Section 17.]

² We say that a function x(t) is piecewise continuous on $0 \le t \le T$ if (1) it is continuous at all but
a finite number of points in $0 \le t \le T$, (2) if x(t) is discontinuous at $t_i$, then left and right limits
of x(t) exist as t approaches $t_i$ from the left and right, and (3) if x(t) is discontinuous at $t_i$, then
$x(t_i)$ equals either the left or the right limit of x(t). In other words, we allow only a finite number
of "simple" discontinuities.


Figure 3.6.1. Piecewise constant approximation $x_n(t)$ of $x_0(t)$, with breakpoints at $T/n, 2T/n, \ldots, (n-1)T/n$.


(b) Suppose $x_0$ is not restricted to the subspace $C[0,T]$, and suppose that $x_0$
is approximated by functions $x_n$ as in (a). Is it true that $x_0 = \lim_{n \to \infty} x_n$?
(c) Consider a different metric on X, namely,

$$d_2(x,y) = \Bigl\{\int_0^T |x(t) - y(t)|^2\,dt\Bigr\}^{1/2}.$$

Does the sequence $\{x_n\}$ converge to $x_0$ in $(X,d_2)$?


2. Suppose that in a metric space (X,d) a sequence $\{x_n\}$ converges to a point $x_0$.
Does it follow that

$$d(x_1,x_0) \ge d(x_2,x_0) \ge d(x_3,x_0) \ge \cdots \ge d(x_n,x_0) \ge \cdots\ ?$$

Either prove that it does, or give a counterexample.


3. Consider the metric space (X,d) given in Example 16, Section 3. Characterize
the collection of all convergent sequences in (X,d).


4. Consider the sequence $\{x_n\}$, where


x,(t) = [cos(n!nt)]"", n=1,2,... 


in the metric space $C[0,1]$ with the sup-metric $d_\infty$. Is $\{x_n\}$ a convergent
sequence?


5. If $\{x_n\}$ and $\{y_n\}$ are convergent sequences in a metric space (X,d) show that the
sequence of real numbers $\{d(x_n,y_n)\}$ converges to $d(x_0,y_0)$, where
$x_0 = \lim_{n \to \infty} x_n$ and $y_0 = \lim_{n \to \infty} y_n$. (This exercise should be reconsidered after
studying Section 7.)


6. Define a concept of sequential convergence for pseudometric spaces. Are the
limits of sequences unique in pseudometric spaces? Give some examples.


7. (a) (Square Root Algorithm) Let a be a real number satisfying $0 \le a \le 1$ and
set $b = 1 - a$. Let $y_0 = 0$ and

$$y_{n+1} = \tfrac12 (b + y_n^2).$$

Show that the sequence $\{y_n\}$ is bounded and monotone, and therefore
convergent. Let $y = \lim y_n$, and $x = 1 - y$. Show that $x^2 = a$.

(b) Modify part (a) for the case $a > 1$.
(c) What happens when $a < 0$? When a is complex?


8. Let $\{x_n\}$ be a sequence in a metric space (X,d) with the property that for some
$\varepsilon_0 > 0$ one has $d(x_n,x_m) \ge \varepsilon_0$ for all n, m with $n \ne m$. Show that $\{x_n\}$ is not convergent.


9. Let $X = R^n$ be given with the metric $d(x,y) = \sum_{i=1}^{n} |x_i - y_i|$. Show that a
sequence $\{x_k\}$ in X converges to $x_0$ if and only if $\lim_{k \to \infty} x_{k,i} = x_{0,i}$ for each
coordinate $x_{0,i}$, $1 \le i \le n$.


10. Let $x_n(t)$ be a sequence of continuous real-valued functions where $x_n(t)$ is
periodic with period $\tau_n > 0$. Assume that $x_n(t) \to x(t)$ uniformly for t in R and
$\tau_n \to \tau$. Show that x(t) is periodic in t with period $\tau$.


11. Let f and g be functions in $C[0,T]$. Define $x_0 = f$, $x_1 = g$, and

$$x_n = \tfrac12 (x_{n-1} + x_{n-2}), \qquad n = 2, 3, \ldots$$

Show that $\lim x_n = \tfrac13 (f + 2g)$ when $C[0,T]$ has the sup-metric.


12. Consider the space $C[0,1]$ with the sup-metric $d_\infty$. Let

$$g_n(t) = \frac{t^2}{t^2 + (1 - nt)^2}.$$

Show that for each t, $g_n(t) \to 0$ as $n \to \infty$. Show that $d_\infty(g_n,0) = 1$.


13. Consider the sequence in Exercise 12 in the space $L_1[0,1]$ with the usual metric

$$d_1(f,g) = \int_0^1 |f(t) - g(t)|\,dt.$$

Show that $d_1(g_n,0) \to 0$ as $n \to \infty$. [Hint: Use the Lebesgue Dominated
Convergence Theorem, Appendix D.]


14. Consider the space $C[0,T]$ with the sup-metric $d_\infty$. Assume that $\{f_n\}$ is a
sequence with $f_n \to f$ in $(C[0,T],d_\infty)$ and $\{t_n\}$ is a sequence with $t_n \to t$ in
$[0,T]$. Show that $f_n(t_n) \to f(t)$ in R. (Use the usual metric on R and $[0,T]$.)


lim y (1-14 SP 4 MY : 


[Hint: Apply Exercise 14 to f,(t) =) fio t 0<t< 0.91] 


. Show that 


16. Can a sequence of discontinuous functions converge uniformly to a continuous
function?


17. Consider the following sequences of functions $\{f_n\}$ defined for $0 \le t < \infty$. Find
all intervals on which these sequences converge uniformly.


i 


t" 
(a) = (b) re: (c) t 

e aa sin n7t 
(d) "1 (e) - e (f) a 
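The square root algorithm of Exercise 7(a) above lends itself to a short numerical sketch. The code below is an illustration under stated assumptions, not a solution to the exercise: the function name `sqrt_by_iteration`, the fixed number of steps, and the sample values of a are hypothetical choices. It iterates y_{n+1} = (b + y_n²)/2 from y_0 = 0 and returns x = 1 − y.

```python
def sqrt_by_iteration(a, steps=200):
    """Iterate y_{n+1} = (b + y_n**2) / 2 with b = 1 - a and y_0 = 0; return x = 1 - y."""
    if not 0.0 <= a <= 1.0:
        raise ValueError("this sketch assumes 0 <= a <= 1")
    b = 1.0 - a
    y = 0.0
    for _ in range(steps):
        y_next = 0.5 * (b + y * y)
        # The iterates are nondecreasing and bounded above by 1.
        assert y <= y_next <= 1.0
        y = y_next
    return 1.0 - y

for a in (0.25, 0.5, 0.81):
    x = sqrt_by_iteration(a)
    print(f"a={a}: x={x:.12f}, x*x={x * x:.12f}")
```

The fixed number of steps is adequate for the sample values shown; for a very close to 0 the iteration converges slowly, and a stopping test on successive iterates would be a more careful design.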




7. A CONNECTION BETWEEN CONTINUITY AND CONVERGENCE 


Although it may appear that continuity and convergence are unrelated concepts,
there is a very important connection between them. It is the purpose of this
section to investigate this connection. In particular, we consider the problem
of interchanging limits and functions; that is, when does $F(\lim_{n \to \infty} x_n)$ equal
$\lim_{n \to \infty} F(x_n)$?


3.7.1 THEOREM. Let $F: (X,d_1) \to (Y,d_2)$ be given and let $x_0$ be a point in X.
Then the following statements are equivalent:

(a) F is continuous at $x_0$.
(b) $\lim F(x_n) = F(\lim x_n)$, for every sequence $\{x_n\}$ with the property that
$\lim x_n = x_0$.

Statement (b) needs a word of explanation. It asserts two things: (i) The limit
of $\{F(x_n)\}$ exists and (ii) this limit agrees with $F(x_0) = F(\lim x_n)$.


Proof: (a) ⇒ (b). Assume that F is continuous at $x_0$ and let $\{x_n\}$ be a
sequence with the property that $\lim x_n = x_0$. We want to show that $\lim F(x_n) = F(x_0)$.
Since F is continuous at $x_0$, for every $\varepsilon > 0$ one can find a $\delta > 0$ such that
$d_2(F(x),F(x_0)) < \varepsilon$ whenever $d_1(x,x_0) < \delta$. Since $\lim x_n = x_0$, there is an N such
that $d_1(x_n,x_0) < \delta$ whenever $n \ge N$. By combining these two statements we have
$d_2(F(x_n),F(x_0)) < \varepsilon$ whenever $n \ge N$. Hence $\lim F(x_n) = F(x_0)$.

(b) ⇒ (a). Let $F(x_n) \to F(x_0)$ whenever $x_n \to x_0$. If F is not continuous at $x_0$,
then there exists an $\varepsilon_0 > 0$ such that for each $\delta > 0$ there is an x with $d_1(x,x_0) < \delta$
and $d_2(F(x),F(x_0)) \ge \varepsilon_0$. Let $x_1$ be such an x for $\delta = 1$, $x_2$ for $\delta = \tfrac12$, and in
general, $x_n$ for $\delta = 1/n$. We then have $d_1(x_n,x_0) < 1/n$ and $d_2(F(x_n),F(x_0)) \ge \varepsilon_0$,
for all n. That is, the sequence $\{x_n\}$ converges to $x_0$, but quite clearly the sequence
$\{F(x_n)\}$ does not converge to $F(x_0)$. But this contradicts our assumption that
$F(x_n) \to F(x_0)$ whenever $x_n \to x_0$. Hence F must be continuous at $x_0$. ∎


The following theorem is an immediate consequence of Theorem 3.7.1.

3.7.2 THEOREM. Let F be a mapping of $(X,d_1)$ into $(Y,d_2)$. The mapping
F is continuous if and only if

$$F(\lim x_n) = \lim F(x_n) \tag{3.7.1}$$

for every convergent sequence $\{x_n\}$ in $(X,d_1)$. In other words, a mapping F is
continuous if and only if it preserves convergent sequences.

If F is not continuous, it still may be true that $F(\lim_{n \to \infty} x_n) = \lim_{n \to \infty} F(x_n)$
for some convergent sequences; however, because of Theorem 3.7.2, this equality
cannot hold for all convergent sequences.
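The last remark can be illustrated with a mapping on the real line that is discontinuous at one point. In the sketch below the unit step function, the two sequences tending to 0, and the helper `tail_gap` are hypothetical choices made for illustration; a finite tail of each sequence is inspected, which suggests but does not prove the limiting behavior.

```python
def step(x):
    """A hypothetical discontinuous map F: R -> R (unit step, with F(0) = 1)."""
    return 1.0 if x >= 0 else 0.0

def square(x):
    """A continuous map for comparison."""
    return x * x

def tail_gap(F, seq, x0, start=1000, count=1000):
    """Largest |F(x_n) - F(x0)| over a finite tail of the sequence n -> seq(n)."""
    return max(abs(F(seq(n)) - F(x0)) for n in range(start, start + count))

def from_right(n):
    return 1.0 / n      # x_n -> 0 through positive values

def from_left(n):
    return -1.0 / n     # x_n -> 0 through negative values

print(tail_gap(square, from_left, 0.0))   # small: F(x_n) approaches F(0)
print(tail_gap(step, from_right, 0.0))    # 0.0: equality holds for this sequence
print(tail_gap(step, from_left, 0.0))     # 1.0: equality fails for this one
```

For the step function the interchange of F and lim holds along one sequence and fails along the other, so by Theorem 3.7.1 the function cannot be continuous at 0.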


The last two theorems seem simple and unobtrusive. However, they are very 
important. They will be used time and time again in this book. The reader should 
master them before proceeding. 
EXAMPLE 1. Let $X = C[0,T]$ be given with the sup-metric $d_\infty$. Since

$$\Bigl|\int_0^T x(t)\,dt - \int_0^T y(t)\,dt\Bigr| \le \int_0^T |x(t) - y(t)|\,dt \le T\,d_\infty(x,y),$$

we see that the function $\int$ is a continuous function from X into the reals R. Let
$\{x_n\}$ be any sequence of functions in X. We know that $x_n \to x_0$ in $(X,d_\infty)$ if and only
if the sequence of functions $\{x_n(t)\}$ converges uniformly to $x_0(t)$. It follows from
Theorem 3.7.2 that

$$\lim \int_0^T x_n(t)\,dt = \int_0^T \lim x_n(t)\,dt \tag{3.7.2}$$

provided the sequence $\{x_n(t)\}$ converges uniformly. ∎
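A numerical sketch of (3.7.2) follows. The particular sequence x_n(t) = t + sin(nt)/n, its uniform limit x(t) = t, the value T = 2, and the helper names are assumptions chosen for illustration. The sketch compares the gap between the two integrals with the bound T·d_∞(x_n,x) from the inequality in Example 1.

```python
import math

T = 2.0

def integral(f, n_grid=4000):
    """Midpoint-rule approximation of the integral of f over [0, T]."""
    h = T / n_grid
    return sum(f((k + 0.5) * h) for k in range(n_grid)) * h

def sup_dist(f, g, n_grid=4000):
    """Grid approximation of the sup-metric d_inf(f, g) on [0, T]."""
    return max(abs(f(k * T / n_grid) - g(k * T / n_grid)) for k in range(n_grid + 1))

def limit(t):
    return t                                      # hypothetical uniform limit x(t) = t

def member(n):
    return lambda t: t + math.sin(n * t) / n      # x_n, with d_inf(x_n, limit) <= 1/n

gaps = {}
for n in (1, 10, 100):
    gap = abs(integral(member(n)) - integral(limit))
    bound = T * sup_dist(member(n), limit)
    gaps[n] = (gap, bound)
    print(f"n={n}: |int x_n - int x| = {gap:.6f} <= T*d_inf = {bound:.6f}")
```

Both columns shrink as n grows, in line with the continuity of the integral in the sup-metric.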


EXERCISES 


1. Are Theorems 3.7.1 and 3.7.2 true when we consider pseudometric spaces
instead of metric spaces?

2. Let (X,d) be the metric space $C[0,T]$ with the sup-metric $d_\infty$, and consider
the mapping K of X into itself represented by $y = Kx$ where

$$y(t) = \int_0^T k(t,\tau)\,x(\tau)\,d\tau,$$

where $k(t,\tau)$ is continuous on $[0,T] \times [0,T]$. Is it true that $x_n \to x_0$ implies
that $K(x_n) \to K(x_0)$?

3. Let $(Y,d_\infty)$ be the metric space $\{C[0,T],d_\infty\}$ and let $(X,d_\infty)$ be the metric space
$C^1[0,T]$ with the sup-metric $d_\infty$, where $C^1[0,T]$ is the set of all continuous
functions on $[0,T]$ with continuous first derivatives. Consider the differential
mapping $y = Dx$ of X into Y given by $y = dx/dt$. Is it true that $x_n \to x_0$ implies
that $D(x_n) \to D(x_0)$? If not, are there any convergent sequences $\{x_n\}$ such that
$\lim_{n \to \infty} D(x_n) = D(\lim_{n \to \infty} x_n)$?

4. Let $\{x_n\}$ and $\{y_n\}$ be two sequences in a metric space (X,d) with $x = \lim x_n$ and
$y = \lim y_n$. Show that if $d(x_n,y_n) \le k$ for all n, then $d(x,y) \le k$.


5. Use Theorem 3.7.2 to compute 
N 
log( lim tea 
N>o n=1 


6. Use Example 1 to prove the following: Let $\{x_n\}$ be a sequence in $C^1[0,T]$ and
assume that $\{dx_n/dt\}$ converges uniformly on $[0,T]$ and that $\{x_n(0)\}$ converges.
Then $\{x_n\}$ converges uniformly, say that $x = \lim x_n$, and moreover

$$dx/dt = \lim dx_n/dt.$$


7. Redo Exercise 5, Section 6, using the results of this section.

8. Let $f: (a,b) \to X$ be defined on an interval $a < t < b$ with range in a metric
space (X,d). We say that $f(t) \to x_0$ as $t \to b^-$ if for every $\varepsilon > 0$ there is a
$\delta > 0$ such that $d(f(t),x_0) < \varepsilon$ whenever $0 < b - t < \delta$.

(a) Show that $f(t) \to x_0$ as $t \to b^-$ if and only if one has $f(t_n) \to x_0$ for every
sequence $t_n$ with $a < t_n < b$ and $t_n \to b$.

(b) Show that $f(t) \to x_0$ as $t \to b^-$ if and only if there is an extension of f,
$f: (a,b] \to X$, that is continuous at b.

9. Statement (a) in Theorem 3.7.1 assures us that the function F is continuous
only at the point $x_0$.
(a) Construct an example showing that F need not be continuous elsewhere.
(b) Using your example explain why Equation (3.7.1) is not valid.

10. Let $\{x_n\}$ be a sequence of functions in $L_p(I)$ where I is a bounded interval, and
$1 \le p < \infty$. (See Example 10, Section 3.) Assume that there is an x in $L_p(I)$
such that $d_p(x_n,x) \to 0$ as $n \to \infty$, where $d_p$ is the usual metric on $L_p(I)$.

(a) Is it true that

$$\lim \int_I x_n\,dt = \int_I x\,dt\ ?$$

(b) What happens if I is an unbounded interval?


Part B 


Some Deeper 
Metric Space Concepts 


8. LOCAL NEIGHBORHOODS 


The object of this section and the next is to show that the concepts of con- 
tinuity and convergence are (in a sense) independent of the metric. The first step 
is to characterize these concepts in terms of “‘ local neighborhoods.”’ The importance 
of this characterization is that it is a natural stepping stone to a deeper under- 
standing of topological structures. 


3.8.1 DEFINITION. Let (X,d) be a metric space and let $x_0$ be an arbitrary
point in (X,d). The set

$$B_r(x_0) = \{x \in X: d(x,x_0) < r\},$$

where $0 < r < \infty$, is referred to as the open ball of radius r centered at $x_0$. The set

$$B_r[x_0] = \{x \in X: d(x,x_0) \le r\},$$

where $0 \le r < \infty$, is referred to as the closed ball of radius r centered at $x_0$. The set

$$S_r[x_0] = \{x \in X: d(x,x_0) = r\},$$

where $0 \le r < \infty$, is referred to as the sphere of radius r centered at $x_0$.
If one thinks of $X = R^3$ with

$$d(x,y) = \{(x_1 - y_1)^2 + (x_2 - y_2)^2 + (x_3 - y_3)^2\}^{1/2},$$

then if $x_0 = (0,0,0)$, $S_r[x_0]$ is the "boundary" of the ball, $B_r(x_0)$ is the "interior,"
and $B_r[x_0]$ is the "boundary plus the interior."³
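The three sets of Definition 3.8.1 translate directly into membership tests. The sketch below uses the Euclidean metric on R³ from the text; the function names and the sample points are hypothetical, and the sample points are chosen so that their distances from the origin are exactly representable in floating point, which keeps the exact comparison in `on_sphere` meaningful.

```python
import math

def d(x, y):
    """Euclidean metric on R^3."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))

def in_open_ball(x, x0, r):
    """x in B_r(x0): d(x, x0) < r."""
    return d(x, x0) < r

def in_closed_ball(x, x0, r):
    """x in B_r[x0]: d(x, x0) <= r."""
    return d(x, x0) <= r

def on_sphere(x, x0, r):
    """x in S_r[x0]: d(x, x0) = r (exact comparison; see the note on sample points)."""
    return d(x, x0) == r

origin = (0.0, 0.0, 0.0)
samples = [(0.5, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 0.0, -1.0), (1.0, 1.0, 0.0)]
for p in samples:
    print(p, in_open_ball(p, origin, 1.0), in_closed_ball(p, origin, 1.0),
          on_sphere(p, origin, 1.0))
```

Each sample point lies in the closed ball exactly when it lies in the open ball or on the sphere, matching the "boundary plus the interior" description.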
We are now in a position to define local neighborhood.⁴


3.8.2 DEFINITION. Let $x_0$ be an arbitrary point in a metric space (X,d). A
subset N of (X,d) is said to be a local neighborhood of $x_0$ if N is either $B_r(x_0)$ or $B_r[x_0]$,
where $r \ne 0$. The positive number r is said to be the radius of the local neighborhood
N. The open balls $B_r(x_0)$ are sometimes referred to as open local neighborhoods
and the closed balls $B_r[x_0]$, $r > 0$, are sometimes referred to as closed local
neighborhoods. The terms "open" and "closed" will be given technical meanings shortly.
For the moment we shall use them only to distinguish between $B_r(x_0)$ and $B_r[x_0]$.
We refer to the family of all local neighborhoods of a point x as the local neighborhood
system of x. Thus, the local neighborhood system of x consists of all open
balls and closed balls of nonzero radius centered at x. We shall denote this system
by $\mathcal{N}(x)$. Let us consider some examples.

³ This terminology is used merely to help the reader understand the nature of the defined sets.
Later, we shall give a precise definition of "boundary" and "interior." (See Exercise 24, Section
12.)

⁴ We use the term "local" neighborhood, even though we shall not discuss other types of
neighborhoods. The reason is that the term "neighborhood" has an accepted meaning which differs
from our usage. (See Exercise 6.)


EXAMPLE 1. Let $X = C[0,T]$ be given with the sup-metric $d_\infty$. Let $x_0$ be
an arbitrary point in $(X,d_\infty)$. Then the set of all continuous functions x such that
$|x(t) - x_0(t)| < 4$ for all $t \in [0,T]$ is an open local neighborhood of $x_0$. This set
is illustrated in Figure 3.8.1. ∎

Figure 3.8.1. The set of all continuous functions between the two dotted lines is an open local neighborhood of $x_0$.


EXAMPLE 2. Let $X = C[0,T]$ be given with the $L_2$-metric $d_2$, that is,

$$d_2(x,y)^2 = \int_0^T |x(t) - y(t)|^2\,dt.$$

Let $x_0$ be an arbitrary point in $(X,d_2)$. Then the set of all $x \in X$ such that

$$\Bigl\{\int_0^T |x(t) - x_0(t)|^2\,dt\Bigr\}^{1/2} < 4$$

is an open local neighborhood of $x_0$. A few of the x's in this local neighborhood
of $x_0$ are sketched in Figure 3.8.2. Note that in contrast to Example 1, we do not
have a convenient way to represent this local neighborhood graphically. ∎


The following lemma states a key property of local neighborhood systems.
Note that it is a consequence of the triangle inequality.

3.8.3 LEMMA. Let $B_r(x_0)$ be any open local neighborhood. Then for every x in
$B_r(x_0)$ there is a local neighborhood $N_x$ of x with $N_x \subset B_r(x_0)$.




Figure 3.8.2. 


Proof: Since $x \in B_r(x_0)$, we have $d(x_0,x) = \alpha < r$. By using the triangle
inequality, we get

$$d(x_0,y) \le d(x_0,x) + d(x,y)$$

for any $y \in X$. In particular, if $d(x,y) < \beta$, where $\alpha + \beta = r$, then $d(x_0,y) < r$.
In terms of local neighborhoods we have (see Figure 3.8.3)

$$N_x = B_\beta(x) \subset B_r(x_0). \ ∎$$

Figure 3.8.3. The ball $B_\beta(x)$ contained in $B_r(x_0)$.
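The construction in the proof, β = r − α, can be spot-checked in the plane. The sketch below is only a sanity check under stated assumptions: the Euclidean metric on R², the center x0, the point x, and the random sampling of points of B_β(x) are hypothetical choices, and sampling cannot replace the triangle-inequality argument.

```python
import math
import random

def d(p, q):
    """Euclidean metric on R^2."""
    return math.hypot(p[0] - q[0], p[1] - q[1])

def inner_radius(x, x0, r):
    """beta = r - alpha, where alpha = d(x0, x) < r, as in the proof of Lemma 3.8.3."""
    alpha = d(x0, x)
    if not alpha < r:
        raise ValueError("x must lie in the open ball B_r(x0)")
    return r - alpha

random.seed(0)
x0, r = (0.0, 0.0), 1.0
x = (0.6, 0.3)
beta = inner_radius(x, x0, r)

# Sample points y of B_beta(x) and confirm that each lies in B_r(x0).
for _ in range(10_000):
    # Stay a hair inside the boundary to sidestep rounding at d(x, y) = beta.
    rho = 0.999 * beta * random.random()
    theta = 2 * math.pi * random.random()
    y = (x[0] + rho * math.cos(theta), x[1] + rho * math.sin(theta))
    assert d(x, y) < beta and d(x0, y) < r

print("beta =", beta)
```

The same radius would not work if the ball around x0 were closed and x sat on its boundary, since then α = r and β = 0.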


Note that Lemma 3.8.3 is not true for closed local neighborhoods. (Why?)
We next turn to the characterization of the continuity of a function F in
terms of local neighborhood systems and the inverse set-function $F^{-1}$.

3.8.4 THEOREM. A function F mapping $(X,d_1)$ into $(Y,d_2)$ is continuous at
a point $x_0$ in $(X,d_1)$ if and only if the inverse image of every local neighborhood of
$F(x_0)$ contains a local neighborhood of $x_0$. (Figure 3.8.4 illustrates this condition.)


Figure 3.8.4. An arbitrary local neighborhood N of $F(x_0)$ in $(Y,d_2)$; its image $F^{-1}(N)$ under the inverse set-function contains a local neighborhood of $x_0$ in $(X,d_1)$.


Proof: First assume that F is continuous at $x_0$ and let N be any local neighborhood
of $F(x_0)$. Let $\varepsilon$ be the radius of N. By continuity of F there is a $\delta > 0$ such that
$F(x) \in N$ whenever $d_1(x,x_0) < \delta$. In other words, $B_\delta(x_0) \subset F^{-1}(N)$.

Now assume that if N is any local neighborhood of $F(x_0)$, then $F^{-1}(N)$ contains
a local neighborhood M of $x_0$. If we let $\varepsilon$ be the radius of N and $\delta$ be the
radius of M, we see that F is continuous at $x_0$. ∎

Since the conditions stated in Theorem 3.8.4 are necessary and sufficient for
the continuity of F at $x_0$, an alternate but equivalent definition of continuity is
possible.

3.8.5 DEFINITION (ALTERNATE). A function $F: (X,d_1) \to (Y,d_2)$ is said to be
continuous at a point $x_0$ in $(X,d_1)$ if the inverse image of each local neighborhood
of $F(x_0)$ contains a local neighborhood of $x_0$. (Note that this definition does not
make use of the radius of any local neighborhood.)


We now turn to a characterization of convergent sequences in terms of the
local neighborhood systems, but first a definition.

3.8.6 DEFINITION. A sequence $\{x_n\}$ of points in a set X is said to be eventually
in a subset $A \subset X$ if there exists an integer N such that $x_n \in A$ for all $n \ge N$. In
other words, the sequence gets into A and stays in A after a finite number of
terms.
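The notion "eventually in" can be probed with a finite check. In the sketch below the helper `eventually_in`, the finite horizon, and the example sequences are hypothetical; because the definition quantifies over all n ≥ N, a finite check can support the claim but can never establish it.

```python
def eventually_in(seq, contains, N, horizon=10_000):
    """Finite check that seq(n) lies in A for N <= n < N + horizon.

    This can only support, not prove, that the sequence is eventually in A,
    since the definition quantifies over all n >= N.
    """
    return all(contains(seq(n)) for n in range(N, N + horizon))

def alternating_over_n(n):
    return (-1) ** n / n          # hypothetical sequence with limit 0

def alternating(n):
    return float((-1) ** n)       # hypothetical sequence with no limit

def in_ball(r):
    """Membership test for the open ball B_r(0) in R with the usual metric."""
    return lambda x: abs(x) < r

print(eventually_in(alternating_over_n, in_ball(0.01), N=101))             # True
print(eventually_in(alternating_over_n, in_ball(0.01), N=50, horizon=10))  # False: x_50 = 0.02
print(eventually_in(alternating, in_ball(0.5), N=1))                       # False: never settles
```

The first sequence passes the check for every radius once N is taken large enough, while the second fails for radius 0.5 no matter how N is chosen.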

The next theorem is a direct consequence of the definitions of "convergent
sequence" and "eventually in."

3.8.7 THEOREM. Let $\{x_n\}$ be a sequence in a metric space (X,d). The sequence
$\{x_n\}$ converges to $x_0$ if and only if $\{x_n\}$ is eventually in every local neighborhood
of $x_0$.


As a result of this theorem, the following is obviously an alternate but equivalent
definition for a convergent sequence.

3.8.8 DEFINITION (ALTERNATE). A sequence $\{x_n\}$ of points in a metric space
(X,d) is said to be convergent if there exists a point $x_0$ in (X,d) such that $\{x_n\}$ is
eventually in each local neighborhood of $x_0$.

We see, then, that both continuity and convergence can be characterized in
terms of local neighborhood systems. These facts are only a prelude to the
characterizations given in the next section, characterizations which lead to certain deep
and important concepts.


EXERCISES 


1. Let $X = R^2$ be given with the metric $d_p(x,y)$ (Example 1, Section 3). Describe
the local neighborhoods for $d_1$, $d_2$, and $d_\infty$. Describe the corresponding local
neighborhoods in $R^3$.

2. Let X be the set of all ordered n-tuples x of 1's and 0's; for example,
$x = \{1,0,0,1,1,\ldots,0\}$. Let d(x,y) = "number of places where x and y differ." For
example, if $x = \{0,1,0,1\}$ and $y = \{1,1,1,1\}$, then $d(x,y) = 2$. Let $x_0$ be an
arbitrary point in the metric space (X,d), and discuss the local neighborhood
system of $x_0$.

3. Carry out a development similar to the one presented in this section for the
pseudometric space environment.

4. Suppose that (A,d) is a subspace of a metric space (X,d). Let $x_0$ be an arbitrary
point in A. How does the local neighborhood system of $x_0$ considered as a
point in the metric space (A,d) compare with the local neighborhood system of
$x_0$ considered as a point in the metric space (X,d)?

5. Let $(X_1,d_1)$ and $(X_2,d_2)$ be metric spaces, and let (X,d) be the product of
$(X_1,d_1)$ and $(X_2,d_2)$. Let $x_1$ and $x_2$ be arbitrary points in $X_1$ and $X_2$,
respectively. Then $(x_1,x_2)$ is a point in X. What does the local neighborhood system
of $(x_1,x_2)$ "look like?" [Note: The local neighborhood system depends on how
we put $d_1$ and $d_2$ together to form d (see Exercise 1).]


6. Let $x_0$ be an arbitrary point in a metric space (X,d). A subset A of (X,d) is
said to be a neighborhood of $x_0$ if A contains a local neighborhood of $x_0$.
(Note that the local neighborhoods are neighborhoods.) The set of all
neighborhoods of a point $x_0$ is said to be the neighborhood system of $x_0$. (Note that the
local neighborhood system of $x_0$ is contained in its neighborhood system.)
Show that the following statements are true for arbitrary $x_0$:

(a) The neighborhood system of $x_0$ is not empty, and $x_0$ is in each of its
neighborhoods.

(b) The intersection of two neighborhoods of $x_0$ is a neighborhood of $x_0$.

(c) If A is a neighborhood of $x_0$, then each superset B of A (that is, $A \subset B$)
is a neighborhood of $x_0$.

(d) Each neighborhood of $x_0$ contains a neighborhood of $x_0$ which in turn is a
neighborhood of each of its points. That is, if A is a neighborhood of $x_0$
there is a neighborhood B of $x_0$ such that $B \subset A$ and if $x \in B$, then B is a
neighborhood of x.

[Remark: Carefully note that if A is a neighborhood of $x_0$, it can easily
happen that A is not a neighborhood of each of its points. Consider a closed
local neighborhood, for example.]


7. Consider a nonempty set X with the metric

$$d(x,y) = \begin{cases} 1, & x \ne y \\ 0, & x = y. \end{cases}$$

Describe the local neighborhoods of a point in X.

8. Let $d_1$ and $d_2$ be metrics on a set X. Show that if there is a constant $k > 0$ such
that $d_1(x,y) \le k\,d_2(x,y)$ for all $x, y \in X$, then each $d_1$-local neighborhood
contains a $d_2$-local neighborhood.


9. Let A be a Lebesgue measurable set in R with finite Lebesgue measure m(A),
and let χ_A(t) be the characteristic function of A.
(a) Show that

\[ f(t) = \int_{-\infty}^{\infty} \chi_A(t+s)\,\chi_A(s)\,ds \]

is continuous and that f(t) ≤ m(A) for all t.
(b) Assume that m(A) > 0. Then show that the set

\[ D(A) = \{ x - y : x, y \in A \} \]

contains a neighborhood of the origin. [Hint: Show that there is a
neighborhood U of the origin with the property that f(t) > 0 for t ∈ U. Then
show that for any t with f(t) > 0 there is an s such that t + s and s belong
to A, and thus t is in D(A).]


9. OPEN SETS 


In the last section we gave characterizations of continuity and convergence 
in terms of local neighborhood systems. To a limited extent these characterizations 
show that continuity and convergence are independent of the metric; that is, the 
metric was used to define the local neighborhood systems, but after that the formu- 
lations of continuity and convergence were essentially given in set-theoretic termin- 
ology. In particular, one could test for continuity, for instance, without knowing 
the specific value of the radius of each local neighborhood. 

In this section we shall go one step further. We shall derive characterizations 
of continuity and convergence in terms of a distinguished family of sets, called 
open sets. This family is called the topology on the space. We shall see that the 
topology is generated by the metric, but in a very surprising way, it is independent
of the metric. In fact, we shall show that many diverse looking metrics generate
the same topology!

Let us see if we can obtain some insight into the topological structure of
metric spaces by considering a special problem. Suppose we are given a set X with
at least two points. We know that we can consider different metrics on X, and for
each metric we obtain a different metric space. Let d_1 and d_2 be two distinct
metrics on X. Our problem is to investigate the relation, if any, between the metric
spaces (X,d_1) and (X,d_2). The purpose of this investigation is to show that
different metric spaces can be essentially the same if we limit ourselves to
questions of continuity of functions and convergence of sequences.

Given, then, that (X,d_1) and (X,d_2) are different metric spaces, are there any
ways that they can be the same as far as continuity and convergence are concerned?
One reasonable approach is to ask whether or not one, or both, of the following
statements is true:


(a) Let F be a mapping of the set X into an arbitrary metric space (Y,d_3). The
mapping F: (X,d_1) → (Y,d_3) is continuous if and only if the mapping
F: (X,d_2) → (Y,d_3) is continuous. See Figure 3.9.1. (In other words, the class
of all mappings defined on X that are continuous with respect to d_1 is the same
as the class of all mappings that are continuous with respect to d_2.)

(b) A sequence {x_n} converges to a point x_0 in (X,d_1) if and only if {x_n}
converges to x_0 in (X,d_2). (In other words, the class of sequences that converge
with respect to d_1 is the same as the class convergent with respect to d_2, and
the limit is independent of which metric is considered.)


It is easy to exhibit metric spaces (X,d_1) and (X,d_2) for which statements
(a) and (b) are true. For example, if for each x the local neighborhood system
generated by d_1 is exactly the same as the local neighborhood system generated
by d_2, then (a) and (b) are true. This occurs if d_1 = αd_2, where α is any
positive constant. Again, the fact that the radius of a local neighborhood
changes from d_1 to d_2 has no bearing on the issue.
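The remark about rescaled metrics can be checked numerically. The following Python sketch is only an illustration under our own assumptions: it takes the Euclidean metric in the plane for d_2, the hypothetical constant α = 2, and helper names (`in_open_ball`, `same_ball`) that are not from the text. It tests on random sample points that the open d_1-ball of radius r about a center is the same set as the open d_2-ball of radius r/α.

```python
import math
import random

ALPHA = 2.0  # hypothetical positive constant alpha (an assumption of this sketch)


def d2(x, y):
    """Euclidean metric on the plane; plays the role of d_2."""
    return math.hypot(x[0] - y[0], x[1] - y[1])


def d1(x, y):
    """Rescaled metric d_1 = alpha * d_2."""
    return ALPHA * d2(x, y)


def in_open_ball(metric, center, radius, point):
    """True if point lies in the open ball of the given radius about center."""
    return metric(center, point) < radius


def same_ball(center, radius, samples=10_000, seed=0):
    """Compare membership in B^{d_1}_r(center) and B^{d_2}_{r/alpha}(center)
    on random sample points; a sampled check, not a proof."""
    rng = random.Random(seed)
    for _ in range(samples):
        p = (rng.uniform(-2.0, 2.0), rng.uniform(-2.0, 2.0))
        in_first = in_open_ball(d1, center, radius, p)
        in_second = in_open_ball(d2, center, radius / ALPHA, p)
        if in_first != in_second:
            return False
    return True


print(same_ball((0.0, 0.0), 1.0))
```

Only the radii differ between the two descriptions of each ball, so the two local neighborhood systems coincide as families of sets.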

A sufficient condition, then, for statements (a) and (b) to be true is that the local
neighborhood systems be the same. However, this is not a necessary condition.
In most interesting situations where (a) and (b) are true, the classes of local
neighborhoods are not the same. For example, in the plane X = R^2, the metrics

\[ d_1(x,y) = \max\{ |x_1 - y_1|, |x_2 - y_2| \} \]

and

\[ d_2(x,y) = [\, |x_1 - y_1|^2 + |x_2 - y_2|^2 \,]^{1/2} \]

generate different local neighborhood systems, yet statements (a) and (b) are true
for the metric spaces (R^2,d_1) and (R^2,d_2). Clearly we are not yet at the heart of
the matter.

Let us give a name to the situation where statements (a) and (b) are true. 




3.9.1 DEFINITION. Let (X,d_1) and (X,d_2) be metric spaces with the same
underlying set X. The metrics d_1 and d_2 are said to be equivalent if both statement
(a) and statement (b) are true.


In the remainder of this section we shall show that there is a simple and 
elegant way to state necessary and sufficient conditions for the equivalence of two 
metrics. 

We start with the identity mapping I of the set X onto itself. Considering the
metric spaces (X,d_1) and (X,d_2), I can be viewed as a mapping of (X,d_1) onto
(X,d_2). Since I is one-to-one and maps X onto X, it is invertible, and I^{-1} maps
(X,d_2) onto (X,d_1). That is,

\[ I: (X,d_1) \to (X,d_2), \qquad I^{-1}: (X,d_2) \to (X,d_1). \]

Given the metric space structure, it makes sense to ask whether or not I or I^{-1} is
continuous. Of course, it can easily happen that one or both are not continuous.
The next theorem is one characterization of the equivalence of d_1 and d_2. It is
not, however, the final word.


3.9.2 THEOREM. Let (X,d_1) and (X,d_2) be two metric spaces with the same
underlying set. The following propositions are equivalent:

(1) The mappings I: (X,d_1) → (X,d_2) and I^{-1}: (X,d_2) → (X,d_1) are continuous.
(2) Statement (a) is true.
(3) Statement (b) is true.
(4) The metrics d_1 and d_2 are equivalent.


Proof: Since (4) is equivalent to (2) and (3) taken together, it suffices to show
that statements (1), (2), and (3) are equivalent.

(1) ⇒ (2). Assume that (1) holds and let F = F_1: (X,d_1) → (Y,d_3) be
continuous. We want to show that F = F_2: (X,d_2) → (Y,d_3) is continuous, see
Figure 3.9.1. Let N be any local neighborhood of F_1(x_0) = F_2(x_0) in (Y,d_3).
Since F_1 is continuous, F_1^{-1}(N) contains a local neighborhood M of x_0 in
(X,d_1). Since I^{-1} is continuous,

\[ (I^{-1})^{-1}(M) = M \subset F_1^{-1}(N) \]

contains a local neighborhood L of x_0 in (X,d_2). It follows from Theorem 3.8.4
that F_2 is continuous at x_0. Since x_0 is arbitrary, F_2 is continuous. We have
shown that F_2 is continuous whenever F_1 is continuous. By using the continuity
of I one can show, in the same way, that F_1 is continuous whenever F_2 is
continuous.

(2) ⇒ (1). This is trivial. Simply apply (2) twice. First to the case (Y,d_3) =
(X,d_2) and F = I, and then to the case (Y,d_3) = (X,d_1) and F = I^{-1}.

(1) ⇔ (3). Theorem 3.7.2 asserts that a function is continuous if and only if
it preserves convergent sequences. Now observe that (3) says that the mappings
I and I^{-1} preserve convergent sequences. ∎




Figure 3.9.1. 


We see, then, that continuity and convergence are preserved in the sense of
statements (a) and (b) if and only if I and I^{-1} are continuous. It now behooves us
to ask: When does this occur? The following lemma gives one answer in terms of
the local neighborhood system.


3.9.3 LEMMA. The identity mappings I: (X,d_1) → (X,d_2) and
I^{-1}: (X,d_2) → (X,d_1)
are continuous at x if and only if
(1) each local neighborhood of x in (X,d_1) contains a local neighborhood of
x in (X,d_2), and
(2) each local neighborhood of x in (X,d_2) contains a local neighborhood of
x in (X,d_1).


The proof of this lemma involves merely noting that it is a restatement of the
continuity of I and I^{-1} in the style of Theorem 3.8.4.


EXAMPLE 1. Suppose X = R^2, the plane, and

\[ d_\infty(x,y) = \max\{ |x_1 - y_1|, |x_2 - y_2| \}, \]
\[ d_2(x,y) = [\, |x_1 - y_1|^2 + |x_2 - y_2|^2 \,]^{1/2}. \]

Then local neighborhoods in (X,d_∞) “look like squares” and in (X,d_2) they
“look like circles.” Let x_0 be an arbitrary element of X. Obviously, each square
centered at x_0 contains a circle centered at x_0 and vice versa. See Figure 3.9.2.
Obviously, then, the metrics d_∞ and d_2 are equivalent. An analytic proof of this
geometric fact will be given in Section 11. ∎


Figure 3.9.2. 
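The geometric claim of Example 1 can be made quantitative: the circle of radius r lies inside the square of radius r, and the square of radius r/√2 lies inside the circle of radius r. The Python sketch below is a sampled check of these two inclusions, not a proof; the function names and the choice of sample region are assumptions made for illustration.

```python
import math
import random


def d_inf(x, y):
    """Max metric on the plane: open balls are axis-parallel squares."""
    return max(abs(x[0] - y[0]), abs(x[1] - y[1]))


def d_2(x, y):
    """Euclidean metric on the plane: open balls are circular disks."""
    return math.hypot(x[0] - y[0], x[1] - y[1])


def nesting_holds(center, r, samples=20_000, seed=1):
    """Test on random points that
    circle(r) is inside square(r), and square(r / sqrt(2)) is inside circle(r)."""
    rng = random.Random(seed)
    for _ in range(samples):
        p = (center[0] + rng.uniform(-2 * r, 2 * r),
             center[1] + rng.uniform(-2 * r, 2 * r))
        in_circle = d_2(center, p) < r
        in_square = d_inf(center, p) < r
        in_small_square = d_inf(center, p) < r / math.sqrt(2)
        if in_circle and not in_square:
            return False  # a point of the circle fell outside the square
        if in_small_square and not in_circle:
            return False  # a point of the small square fell outside the circle
    return True


print(nesting_holds((0.3, -1.2), 0.75))
```

In the language of Lemma 3.9.3, each local neighborhood in one metric contains a local neighborhood in the other, which is exactly what the lemma requires.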


EXAMPLE 2. Let X = R^2 again, and let

\[ d_0(x,y) = \begin{cases} 0, & \text{if } x = y \\ 1, & \text{if } x \neq y, \end{cases} \]

\[ d_2(x,y) = [\, |x_1 - y_1|^2 + |x_2 - y_2|^2 \,]^{1/2}. \]

The local neighborhood system of each point x in (R^2,d_0) contains exactly two
sets: the point set {x} and the space R^2 itself. (Why?) Thus each local
neighborhood of a point x in (X,d_2) contains a local neighborhood of x in (X,d_0),
namely, {x}. However, the local neighborhood {x} in (X,d_0) does not contain a
local neighborhood of x in (X,d_2). (The reader should show that I: (X,d_0) →
(X,d_2) is continuous but I^{-1}: (X,d_2) → (X,d_0) is not.) ∎
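One way to see the failure of equivalence in Example 2 is through statement (b). The following Python sketch uses a sequence of our own choosing, x_n = (1/n, 0), which is not from the text: it converges to the origin with respect to d_2, while its d_0-distance to the origin never drops below 1.

```python
import math


def d_0(x, y):
    """Discrete metric of Example 2."""
    return 0.0 if x == y else 1.0


def d_2(x, y):
    """Euclidean metric on the plane."""
    return math.hypot(x[0] - y[0], x[1] - y[1])


origin = (0.0, 0.0)
# Hypothetical test sequence x_n = (1/n, 0), n = 1, ..., 1000.
seq = [(1.0 / n, 0.0) for n in range(1, 1001)]

euclid_dist = [d_2(p, origin) for p in seq]
discrete_dist = [d_0(p, origin) for p in seq]

print(euclid_dist[0], euclid_dist[-1])  # shrinks toward 0
print(set(discrete_dist))               # stays at the single value 1.0
```

So the same sequence is convergent in (R^2,d_2) but not convergent to the origin in (R^2,d_0), and statement (b) fails for this pair of metrics.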


Lemma 3.8.3 says that each open local neighborhood in a metric space contains 
a local neighborhood of each of its points. Obviously there are sets other than 
open local neighborhoods that possess this property; for example, unions of open 
local neighborhoods. Heretofore we have used the term open in a relatively vague 
manner. Let us now give it a precise meaning. 


3.9.4 DEFINITION. A set A in a metric space (X,d) is said to be open if A
contains a local neighborhood of each one of its points, that is, if for every x in
A there is a local neighborhood N of x with N ⊂ A.


Note that the empty set ∅ and X itself are always open sets. Lemma 3.8.3
shows us that the open local neighborhoods are also open sets in the technical
sense just defined.


And now for the most important definition of this section. 




3.9.5 DEFINITION. The class of all open sets in (X,d) is referred to as the
topology (generated by the metric d) and it is denoted by 𝒯.


Now let d_1 and d_2 be two metrics on a set X. Let 𝒯_1 and 𝒯_2 be the topologies
generated by d_1 and d_2, respectively. The following theorem, which is the key result
of this section, gives the promised characterization of the equivalence of two metrics.


3.9.6 THEOREM. Let (X,d_1) and (X,d_2) be two metric spaces with the same
underlying set X. Then d_1 and d_2 are equivalent if and only if

\[ \mathcal{T}_1 = \mathcal{T}_2. \]

In other words, the metrics d_1 and d_2 are equivalent if and only if they generate the
same class of open sets.


Proof: First assume that 𝒯_1 = 𝒯_2. We will show that I: (X,d_1) → (X,d_2)
is continuous. A simple modification of the argument can be used to show the
continuity of I^{-1}: (X,d_2) → (X,d_1).

Let x_0 be any point in (X,d_1), and let N_2 be any local neighborhood of x_0
with respect to the metric d_2. N_2 contains an open local neighborhood M_2 of
x_0 with respect to d_2. Since 𝒯_1 = 𝒯_2, M_2 is also an open set with respect to d_1.
Therefore M_2 = I^{-1}(M_2) contains a local neighborhood of x_0 in (X,d_1). Thus, the
inverse image of each local neighborhood of I(x_0) contains a local neighborhood
of x_0. Hence I is continuous at x_0 and therefore everywhere.

Let us now assume that I and I^{-1} are continuous. Let A be an open set with
respect to d_2. Then for each x in A there is a local neighborhood N_2 of x in (X,d_2)
with N_2 ⊂ A. Since I is continuous, the inverse image N_2 = I^{-1}(N_2) contains a
local neighborhood N_1 of x in (X,d_1). That is, x ∈ N_1 ⊂ N_2 ⊂ A. We have thus
shown that every open set in 𝒯_2 is also open in 𝒯_1, that is, 𝒯_2 ⊂ 𝒯_1. By repeating
this argument and using the continuity of I^{-1} we conclude that 𝒯_1 ⊂ 𝒯_2. Hence
𝒯_1 = 𝒯_2. ∎


We see, then, that open sets can be used to characterize the equivalence of 
metrics. It should not be surprising that this concept can also be used to charac- 
terize continuous functions and convergent sequences. As a matter of fact, these 
characterizations are so elegant that many authors use them as definitions. 


3.9.7 THEOREM. A mapping F: (X,d_1) → (Y,d_2) is continuous if and only if
the inverse image of each open set in (Y,d_2) is an open set in (X,d_1).


Proof: First assume that F is continuous and let A be an open set in (Y,d_2).
Let x ∈ F^{-1}(A) be given and let y = F(x). Then y ∈ A and since A is open there is a
local neighborhood M of y with M ⊂ A. By Theorem 3.8.4, F^{-1}(M) contains a
local neighborhood N of x. We then have N ⊂ F^{-1}(M) ⊂ F^{-1}(A). Hence F^{-1}(A)
is open.




Now let us go the other way. Assume that for each open set A in (Y,d_2) the
set F^{-1}(A) is open in (X,d_1). Let x be an arbitrary point in (X,d_1), and let M be
any open local neighborhood of F(x). Then F^{-1}(M) is an open set in (X,d_1) and
x ∈ F^{-1}(M). Hence there is a local neighborhood N of x with N ⊂ F^{-1}(M). By
Theorem 3.8.4, this shows that F is continuous, which completes the proof of the
theorem. ∎


Using the above result, it is almost a triviality to prove that the composition 
of two continuous functions is continuous. We leave the proof as an exercise. 


3.9.8 THEOREM. Let f: (X,d_1) → (Y,d_2) and g: (Y,d_2) → (Z,d_3) be continuous.
Then the composition h = gf is continuous.


Before showing that we can also characterize convergence in terms of open
sets, let us digress and consider the open mapping concept. Theorem 3.9.7 says
that F is continuous if and only if the inverse image of each open set is an open
set. A standard mistake is to turn this theorem around and say the wrong thing.
It is not true that continuous functions necessarily map open sets onto open sets,
that is, if A is open and F continuous this does not imply that F(A) is open. For
example, the constant mapping F: X → Y, where F(x) = y_0, maps every nonempty
subset of X onto {y_0}. Very often the subset {y_0} is not open.

However, functions that do map open sets onto open sets are important, so 
we introduce the following definition. 


3.9.9 DEFINITION. A mapping F: (X,d_1) → (Y,d_2) is said to be an open
mapping if F(A) is an open set in (Y,d_2) whenever A is an open set in (X,d_1).


We shall discuss this further in the exercises. Let us now look at the question 
of the convergence of sequences. 


3.9.10 THEOREM. Let {x_n} be a sequence in a metric space (X,d). The sequence
{x_n} converges to a point x_0 in (X,d) if and only if the sequence is eventually in every
open set containing x_0.


Proof: First assume that lim x_n = x_0. Then for every open set A containing
x_0, there exists a local neighborhood N of x_0 contained in A. By Theorem 3.8.7
{x_n} is eventually in N, hence {x_n} is eventually in A.

On the other hand, if {x_n} is eventually in each open set containing x_0, it is
eventually in each open local neighborhood of x_0, and consequently {x_n} is
eventually in each local neighborhood of x_0. Hence lim x_n = x_0, by Theorem
3.8.7. ∎
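To make the phrase "eventually in" concrete, the Python sketch below takes the sequence x_n = 1/n on the real line and the open set (-ε, ε), both chosen by us for illustration, and finds the index after which every computed term lies in that set. The helper name `tail_index` is hypothetical, and the check necessarily covers only the finitely many terms that are stored.

```python
def tail_index(seq, contains):
    """Return the smallest list index N such that every stored term seq[k]
    with k >= N satisfies contains(seq[k]). Finite check only."""
    last_outside = -1
    for k, x in enumerate(seq):
        if not contains(x):
            last_outside = k
    return last_outside + 1


# Hypothetical example: x_n = 1/n for n = 1, ..., 2000 (list index k = n - 1).
seq = [1.0 / n for n in range(1, 2001)]
eps = 0.01

N = tail_index(seq, lambda x: abs(x) < eps)
print(N, all(abs(x) < eps for x in seq[N:]))
```

Repeating the computation with smaller values of ε pushes the index N further out, but for this sequence some such index always exists, which is what convergence to 0 means in the sense of Theorem 3.9.10.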


Needless to say, the conclusion of Theorem 3.9.10 can be and often is used as
an alternative definition for sequential convergence in metric spaces.




What happens if the topologies 𝒯_1 and 𝒯_2 generated by metrics d_1 and d_2,
respectively, are not the same? Obviously, the metrics d_1 and d_2 are not equivalent.
Is there anything that we can say? In many situations there is. Suppose that
𝒯_1 ⊂ 𝒯_2, that is, each subset of X that is open with respect to 𝒯_1 is also open
with respect to 𝒯_2. In this situation we say that 𝒯_2 is stronger than (finer than)
𝒯_1; or that 𝒯_1 is weaker than (coarser than) 𝒯_2. Needless to say, if 𝒯_1 is both
stronger and weaker than 𝒯_2, then 𝒯_1 = 𝒯_2.


EXAMPLE 3. Let X be a nonempty set and let

\[ d(x,y) = \begin{cases} 0, & \text{if } x = y \\ 1, & \text{if } x \neq y. \end{cases} \]

Since the open ball B_a(x), 0 < a < 1, is simply the point set {x} for each point
x in X, every subset of X is an open set. Thus, the topology 𝒯 generated by d
is the class of all subsets of X. It follows that any function mapping (X,d) into
some metric space is continuous. Moreover, 𝒯 is obviously stronger than any other
topology on X.

We mention without proof that at the other extreme is the topology generated
by the pseudometric ρ(x,y) = 0. It is made up of exactly two sets: X and ∅, the
empty set. This is the weakest possible topology. (See Exercises 4 and 13.) ∎
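For a small finite set, the first claim of Example 3 can be verified by brute force. The sketch below applies Definition 3.9.4 directly to every subset of a hypothetical four-point set; the set `X` and the helper names are assumptions of this illustration. For the discrete metric the ball of radius 1/2 is already the smallest local neighborhood, so it suffices to test that one radius.

```python
from itertools import combinations

X = ["a", "b", "c", "d"]  # hypothetical four-point underlying set


def d(x, y):
    """Discrete metric of Example 3."""
    return 0 if x == y else 1


def open_ball(center, radius):
    """Open ball B_radius(center) inside X."""
    return {y for y in X if d(center, y) < radius}


def is_open(A):
    """Definition 3.9.4 for the discrete metric: every point of A has a
    local neighborhood (here the ball of radius 1/2) contained in A."""
    return all(open_ball(x, 0.5) <= A for x in A)


all_subsets = [set(c) for k in range(len(X) + 1) for c in combinations(X, k)]

print(open_ball("a", 0.5), open_ball("a", 2))
print(len(all_subsets), all(is_open(A) for A in all_subsets))
```

The empty set passes the test vacuously, in agreement with the remark following Definition 3.9.4.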


The importance of 𝒯_1 ⊂ 𝒯_2 is as follows:


(a) Continuity with respect to 𝒯_1 implies continuity with respect to 𝒯_2.
(Exercise 14.)


(b) Convergence with respect to 𝒯_2 implies convergence with respect to
𝒯_1. (Exercise 15.)


It can happen that neither 𝒯_1 ⊃ 𝒯_2 nor 𝒯_2 ⊃ 𝒯_1. In that case, we say that
the topologies are incommensurable. Otherwise the topologies are commensurable.


EXERCISES 


1. Let (A,d) be a subspace of a metric space (X,d). How is the topology of 
(A,d) related to the topology of (X,d)? 

2. Let (X_1,d_1) and (X_2,d_2) be metric spaces and let (X,d) be the product of
these two metric spaces. How is the topology of (X,d) related to the topologies
of (X_1,d_1) and (X_2,d_2)?

3. Suppose that a precision cutting tool is to cut a piece in a form which can be
represented by a curve x_0 in C^1[0,T]. The realized form of a given piece is
represented by the curve x which is also in C^1[0,T]. The problem is to place a
metric on C^1[0,T] which meaningfully characterizes the way in which x
differs from x_0. Let the possible metrics be


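\[ d_\infty(x,y) = \text{sup-metric}, \]

\[ d_2(x,y) = \Bigl\{ \int_0^T |x(t) - y(t)|^2\,dt \Bigr\}^{1/2}, \]

\[ d_3(x,y) = \Bigl\{ \int_0^T |x(t) - y(t)|^3\,dt \Bigr\}^{1/3}, \]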


\[ d(x,y) = d_2(x,y) + d_\infty(\dot{x},\dot{y}), \quad \text{where } \dot{x} = dx/dt. \]

(a) Is it true that as the error measured in terms of one of the metrics
becomes smaller and smaller, the error measured in terms of the others must
also become smaller and smaller?

(b) Let the topologies generated on C^1[0,T] by the metrics be denoted 𝒯_∞,
𝒯_2, 𝒯_3, 𝒯. Which topologies are commensurable? If two topologies
are commensurable, which is the stronger one? [Hint: See Exercise 8,
Section 8.]


4. Carry out a development similar to the one of this section for the pseudo- 
metric spaces. 


5. An important problem in modern control theory is the selection of an optimum
input u(t) from a set Q of allowable inputs. The set Q is usually determined
by a constraint on the inputs. For example, u(t) may correspond to thrust, or
force, and the amount of available thrust or force may be limited, that is,
|u(t)| ≤ M. Then again u^2 may correspond to instantaneous power and total
available energy may be limited, that is,

\[ \int_0^T |u(t)|^2\,dt \le N. \]

Or u(t) may correspond to fuel rate and the total amount of fuel available may
be limited, that is,

\[ \int_0^T |u(t)|\,dt \le F. \]

It turns out that problems in this spirit can often be formulated within a
metric space framework. It then happens that it is important to know whether
or not the set of allowable inputs, Q, is an open set. In what follows it is
assumed that Q is a subset of X = C[0,T].

(a) Let X have the sup-metric d_∞. Which of the following are open sets?

(1) Q = {u ∈ X: |u(t)| ≤ M}.
(2) Q = {u ∈ X: |u(t)| < M}.
(3) Q = {u ∈ X: ∫_0^T |u(t)|^2 dt ≤ N}.
(4) Q = {u ∈ X: ∫_0^T |u(t)|^2 dt < N}.
(5) Q = {u ∈ X: ∫_0^T |u(t)| dt ≤ F}.
(6) Q = {u ∈ X: ∫_0^T |u(t)| dt < F}.


(b) Now let X have the metric d_2, where

\[ d_2(x,y)^2 = \int_0^T |x(t) - y(t)|^2\,dt. \]

Which of the Q's from (a) are open?

(c) Now let X have the metric d_1, where

\[ d_1(x,y) = \int_0^T |x(t) - y(t)|\,dt. \]

Which of the Q's from (a) are open?

[Remark: Amusingly enough, "less than" or "less than or equal to" in
the characterization of the above sets often leads to fundamentally different
situations in optimum control theory. In other words, this is not a mere
splitting of hairs; see Neustadt [1].]


6. Show that a nonempty set A in a metric space (X,d) is an open set if and only
if it is a union of open local neighborhoods.


7. Let 𝒯_1 and 𝒯_2 be the two topologies generated by metrics d_1 and d_2 on a set
X. Let I: (X,d_1) → (X,d_2) and I^{-1}: (X,d_2) → (X,d_1) denote the identity maps.

(a) Show that 𝒯_1 ⊂ 𝒯_2 if and only if I^{-1} is continuous.

(b) Show that 𝒯_1 ⊂ 𝒯_2 if and only if every sequence that is convergent in
(X,d_2) is also convergent in (X,d_1).


8. Prove Theorem 3.9.8. 


9. Let d_2 be the Euclidean metric on the plane R^2, that is,

\[ d_2(x,y) = \{ |x_1 - y_1|^2 + |x_2 - y_2|^2 \}^{1/2}. \]

(a) Show that the mapping (x_1,x_2) → (y_1,y_2) given by

\[ \begin{pmatrix} y_1 \\ y_2 \end{pmatrix} = \begin{pmatrix} 1 & 0 \\ 0 & 0 \end{pmatrix} \begin{pmatrix} x_1 \\ x_2 \end{pmatrix} \]

is not an open mapping of (R^2,d_2) into (R^2,d_2).

(b) Show that the mapping (x_1,x_2) → y_1 given by y_1 = x_1 is an open mapping
of (R^2,d_2) onto the real line R.


10. Let X = C[0,1] and let 𝒯_p, 1 ≤ p ≤ ∞, be the topologies generated by the
metrics

\[ d_p(x,y) = \Bigl( \int_0^1 |x(t) - y(t)|^p\,dt \Bigr)^{1/p}, \quad 1 \le p < \infty, \]

\[ d_\infty(x,y) = \text{sup-metric}. \]


Discuss the commensurability of these topologies. What happens if X is
replaced by the Lebesgue space L_∞[0,1] or L_1[0,1]?


11. Let (X,d) be a metric space. Which of the following metrics are equivalent
to d? (Prove your assertions.)

(a) d_1(x,y) = d(x,y) / (1 + d(x,y)).
(b) d_2(x,y) = min(1, d(x,y)).
(c) d_3(x,y) = sup_{t ∈ X} |d(x,t) - d(y,t)|.

12. Let A be a finite set in a metric space (X,d). Show that the complement A'
is open.

13. Let X be a nonempty set and let ρ be the pseudometric ρ(x,y) = 0. What is
the topology generated by ρ? Compare this with Example 3.


14. Let 𝒯_1 and 𝒯_2 be the two topologies generated by metrics d_1 and d_2 on a set
X, and let (Y,d_3) be a metric space. Denote the family of all continuous
mappings of (X,d_i), i = 1,2, into (Y,d_3) by C_i(Y). Show that C_1(Y) ⊂ C_2(Y)
if and only if 𝒯_1 ⊂ 𝒯_2.

15. (Continuation of Exercise 14.) Let CS_i, i = 1,2, denote the family of all
convergent sequences in (X,d_i). Show that CS_2 ⊂ CS_1 if and only if 𝒯_1 ⊂ 𝒯_2.


10. MORE ON OPEN SETS 


In this section we shall investigate the general question of equivalence of 
metric spaces. For this study we shall not assume that the metric spaces have the 
same underlying set, as we did in the last section. Equivalence will be defined, as 
was done above, in terms of the concepts of continuity and convergence. We shall 
see that the topological structure of a metric space again plays a key role. 

Let (X,d_1) and (Y,d_2) be metric spaces, and assume that the points in (X,d_1)
can be placed into a one-to-one correspondence with the points in (Y,d_2). That is,
assume there is an invertible mapping G of X onto Y.

If f maps (X,d_1) into a metric space (Z,d_3), then fG^{-1} maps (Y,d_2) into
(Z,d_3). Conversely, if h maps (Y,d_2) into (Z,d_3), then hG maps (X,d_1) into
(Z,d_3). (See Figure 3.10.1.) In fact, G puts the functions defined on (X,d_1) into
one-to-one correspondence with the functions defined on (Y,d_2).

Similarly, if {x_n} is a sequence in (X,d_1), then {y_n} = {G(x_n)} is a sequence in
(Y,d_2). Conversely, if {y_n} is a sequence in (Y,d_2), then {x_n} = {G^{-1}(y_n)} is a
sequence in (X,d_1). Moreover, G puts the sequences in (X,d_1) into one-to-one
correspondence with the sequences in (Y,d_2).


Figure 3.10.1. 




The following two statements are the natural extensions of statements (a)
and (b) of the foregoing section.


(a') Let f be a mapping of (X,d_1) into an arbitrary metric space (Z,d_3). The
mapping f: (X,d_1) → (Z,d_3) is continuous if and only if the mapping

\[ h = fG^{-1}: (Y,d_2) \to (Z,d_3) \]

is continuous.

(b') A sequence {x_n} in (X,d_1) converges to a point x_0 if and only if the sequence
{G(x_n)} in (Y,d_2) converges to G(x_0).


The mapping G is taking the role of the identity I in the preceding section.
The reader will recall the importance of the continuity of I and I^{-1}. This fact
motivates the following definition.


3.10.1 DEFINITION. A mapping G: (X,d_1) → (Y,d_2) is said to be a
homeomorphism if (i) G is invertible and (ii) both G and G^{-1} are continuous.
(Note that G is a homeomorphism if and only if G^{-1} is a homeomorphism.)


3.10.2 THEOREM. Let (X,d_1) and (Y,d_2) be metric spaces, and let G be an
invertible mapping of (X,d_1) onto (Y,d_2). Then the following statements are
equivalent:

(1) The mapping G is a homeomorphism.
(2) Statement (a') is true.
(3) Statement (b') is true.


The proof of this theorem is left to the reader. 

What does this theorem say? The mapping G puts the points of (X,d_1) into
one-to-one correspondence with the points in (Y,d_2). Moreover, and this is the
important point, since G is a homeomorphism, it also puts the open sets of (X,d_1)
into one-to-one correspondence with the open sets of (Y,d_2). If we view the mapping
G as a renaming of points, we see that the metric spaces (X,d_1) and (Y,d_2) differ
in the names given to their points but not in topological structure. Obviously, the
"names given to points" are unimportant as far as continuity and convergence are
concerned.

The next definition is the key concept introduced in this section. 


3.10.3 DEFINITION. Two metric spaces (X,d_1) and (Y,d_2) are said to be
homeomorphic (to one another) if there exists a homeomorphism mapping one of
them onto the other.


Note that homeomorphic spaces (X,d_1) and (Y,d_2) can easily have many
homeomorphisms mapping (X,d_1) onto (Y,d_2). Definition 3.10.3 requires only
that there be at least one. A metric space (X,d) is always homeomorphic to itself,
for the identity mapping I is always a homeomorphism. However, there are often
other homeomorphisms of (X,d) onto itself.

There is, of course, a close connection between the concepts of homeomorphic
metric spaces and equivalent metrics (Definition 3.9.1). In particular, two metrics
d_1 and d_2 on a set X are equivalent if and only if I, the identity mapping, is a
homeomorphism. Thus, d_1 and d_2 are equivalent if and only if the metric spaces
(X,d_1) and (X,d_2) are homeomorphic in a special way, the special way being that
I: (X,d_1) → (X,d_2) is a homeomorphism. Note that this does not rule out the
possibility that (X,d_1) and (X,d_2) may be homeomorphic while I is not a
homeomorphism.

If two metric spaces are homeomorphic to one another, it is clear that any
property that can be characterized in terms of open set structure only is possessed
by both or neither of them. We will see subsequently that separability and
compactness are examples of such properties.


3.10.4 DEFINITION. A property P is said to be a topological property (topo- 
logical invariant) if whenever a metric space (X,d) possesses P every metric space 
homeomorphic to (X,d) possesses property P. 


It should be noted that there is no requirement that distance be preserved by
a homeomorphism. It can easily happen that d_1(x,y) ≠ d_2(Gx,Gy), where G is
a homeomorphism. Since we are often interested in homeomorphisms which do
preserve distance, we introduce the following concept:


3.10.5 DEFINITION. A mapping G of (X,d_1) onto (Y,d_2) is said to be an
isometry if d_1(x,y) = d_2(Gx,Gy) for every pair x and y in (X,d_1). In this situation,
we shall say that (X,d_1) and (Y,d_2) are isometric (to one another).


We leave it to the reader to show that an isometry is a homeomorphism. 

If two metric spaces are isometric, they can be viewed as being essentially
the same metric space. Again, the "names given to points" are different, but these
names are unimportant for questions of distance as well as questions of continuity
and convergence.
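The difference between an isometry and a mere homeomorphism can be seen in two standard examples, which are our own choices and not taken from the text. A rotation of the Euclidean plane preserves distances, while the arctangent, a homeomorphism of the real line onto the open interval (-π/2, π/2) with inverse tan, does not. The Python sketch below compares distances before and after applying each map; the sample points and the angle are arbitrary assumptions.

```python
import math


def d_euclid(x, y):
    """Euclidean metric on the plane."""
    return math.hypot(x[0] - y[0], x[1] - y[1])


def d_real(s, t):
    """Usual metric on the real line."""
    return abs(s - t)


def rotate(p, theta):
    """Rotation of the plane about the origin by the angle theta."""
    c, s = math.cos(theta), math.sin(theta)
    return (c * p[0] - s * p[1], s * p[0] + c * p[1])


# Rotation: distances agree up to rounding error, so it acts as an isometry.
x, y = (1.0, 2.0), (-3.0, 0.5)
theta = 0.7
print(d_euclid(x, y), d_euclid(rotate(x, theta), rotate(y, theta)))

# Arctangent: continuous with continuous inverse tan, yet distances shrink.
s, t = 0.0, 10.0
print(d_real(s, t), d_real(math.atan(s), math.atan(t)))
print(math.tan(math.atan(t)))  # the inverse recovers t up to rounding
```

The second map changes the "names" of points without changing which sequences converge, but it does not preserve the numerical value of the distance.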

Let us end this section by delving a bit into the structure of topologies
themselves. Suppose a set X and a class of subsets 𝒜 of X are given. Is there a metric
on X such that 𝒜 is the topology generated by the metric? One can give an answer
to this question, but the answer is beyond the scope of this book.^5 However, the
following theorem presents an important part of the answer.


3.10.6 THEOREM. Let (X,d) be a metric space and let 𝒯 be the topology
generated by d. Then the following statements are valid:

(1) ∅, the empty set, and X are in 𝒯.
(2) If {A_α} is any collection of open sets, then ∪_α A_α is an open set.
(3) If A_1, ..., A_n is a finite collection of open sets, then ∩_{i=1}^n A_i is an open set.


^5 An answer can be found in Kelley [1; pp. 124-130].


The proof is left as an exercise. 

We see, then, that a topology on a metric space is closed under arbitrary unions
and finite intersections. This may appear to be strangely asymmetrical. However,
a simple example will illustrate what can happen. Consider the class of open local
neighborhoods B_{1/n}(x), n = 1,2,.... These open local neighborhoods are open
sets, and the intersection ∩_n B_{1/n}(x) is simply the point set {x}, which is generally
not an open set. (Give an example of a metric space where {x} is not an open set.)
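The shrinking intersection can be watched numerically on the real line with the usual metric, a setting chosen here only for illustration. In the Python sketch below, with hypothetical helper names, a point belongs to the intersection of B_{1/n}(center) for n = 1, ..., n_max only if it passes every one of the membership tests; any point other than the center is eventually excluded once n_max is large enough.

```python
def in_ball(x, center, n):
    """True if x lies in the open ball B_{1/n}(center) on the real line."""
    return abs(x - center) < 1.0 / n


def in_intersection(x, center, n_max):
    """True if x lies in B_{1/n}(center) for every n = 1, ..., n_max."""
    return all(in_ball(x, center, n) for n in range(1, n_max + 1))


center = 0.0
for candidate in (0.0, 1e-2, 1e-4):
    print(candidate,
          in_intersection(candidate, center, 100),
          in_intersection(candidate, center, 100_000))
```

Only finitely many balls are tested in each call, so each finite intersection is still an open interval; it is the infinite intersection that collapses to the single point.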

The converse of Theorem 3.10.6 is not true. That is, if we are given a collection
of subsets 𝒜 of X that is closed under arbitrary unions and finite intersections and
that contains ∅ and X, it is not necessarily true that 𝒜 is the topology generated
by some metric on X. For a detailed discussion of a converse for Theorem 3.10.6
we again refer the reader to Kelley [1; pp. 124-130].


EXERCISES 


1. Carry out a development for pseudometric spaces similar to the one carried 
out in this section for metric spaces. 


2. Prove Theorem 3.10.6. 


3. Give an example of two metric spaces with the same underlying set, say
(X,d_1) and (X,d_2), that are homeomorphic to one another but for which the
metrics d_1 and d_2 are not equivalent. [Hint: One approach is by way of product
spaces.]


4. Sometimes a metric space is made up of "more than one piece." More
precisely, a metric space (X,d), for that matter any topological space, is said to be
disconnected if it is the union of two open, nonempty, disjoint subsets; that is,
if there exist open subsets A and B of (X,d) such that A ≠ ∅, B ≠ ∅, A ∩ B = ∅,
and X = A ∪ B. A metric space (X,d) is said to be connected if it is not
disconnected.

(a) Give an example of a disconnected metric space. 
(b) Give an example of a connected metric space. 


5. Show that connectedness is a topological property. 


6. Let f: X → Y be continuous, where (X,d_1) and (Y,d_2) are metric spaces. Let
A be a connected set in X, that is, the metric space (A,d_1) is connected. Show
that f(A) is connected in Y, that is, (f(A),d_2) is a connected space.

7. A set B in a metric space (X,d) is said to be contractible (to a point x_0) if there
is a continuous mapping

\[ F(x,t): B \times I \to B, \]

with the following properties:

(a) F(x,0) = x, for all x in B.

(b) F(x,1) = x_0, for all x in B.

(Here I = {t: 0 ≤ t ≤ 1}, and B × I has the metric

\[ d'((x_1,t_1),(x_2,t_2)) = d(x_1,x_2) + |t_1 - t_2|.) \]

Show that contractibility is a topological property.


8. Use Exercise 7 to show that a circle in the plane is not homeomorphic with a
line segment.


9. Suppose that F is a one-to-one mapping of a set X onto a metric space (Y,d_2).

(a) Does d_1(x_1,x_2) = d_2[F(x_1), F(x_2)] define a metric on X?
(b) If so, is F a homeomorphism? Anything stronger?

10. Let (X,d) be a metric space, where X is nonempty, and let Y = BC(X,R)
denote the collection of all bounded, continuous real-valued functions defined
on X. Assume that Y has the metric

\[ \sigma(f,g) = \sup\{ |f(x) - g(x)| : x \in X \}, \]

see Exercise 6, Section 5. Let x_0 be a fixed point in X and define

\[ f_y(x) = d(x,y) - d(x,x_0). \]

Show that the mapping G: y → f_y is an isometry from X onto a subspace of Y.


11. (Continuation of Exercise 10.) Assume that X = R with the usual metric.
Sketch a few of the functions f_y.

12. (Continuation of Exercise 9, Section 5.) Show that the mapping x → δ_x of X
into LC,,* is an isometry. [Hint: Compute |δ_x(f) - δ_y(f)| and ||f|| where
f(z) = ½[d(x,z) - d(y,z)] + ½[d(y,x_0) - d(x,x_0)].]

13. Let f: (X_1,d_1) → (X_2,d_2) be a homeomorphism of X_1 onto X_2, and let
g: (X_2,d_2) → (X_3,d_3). Show that g is continuous if and only if

\[ gf: (X_1,d_1) \to (X_3,d_3) \]

is continuous. What happens if f is not a homeomorphism?


14. Let X = C(a,b) be the space of continuous real-valued functions defined on
the open bounded interval (a,b). For x, y ∈ X let

\[ D(x,y) = \{ t \in (a,b) : x(t) \neq y(t) \}. \]

(a) Show that D(x,y) is the union of disjoint open intervals.

(b) Let d(x,y) denote the sum of the lengths of these intervals. Show that d
is a metric on X.

(c) Find a sequence of bounded functions in X that converge to x(t) = (t - a)^{-1}.


11. EXAMPLES OF HOMEOMORPHIC METRIC SPACES


In this section we present several examples to illustrate the concepts of
homeomorphism and isometry.


EXAMPLE 1. On the set $X = R^2$, define the metrics

$$d_1(x,y) = |x_1 - y_1| + |x_2 - y_2|,$$
$$d_2(x,y) = (|x_1 - y_1|^2 + |x_2 - y_2|^2)^{1/2},$$
$$d_\infty(x,y) = \max(|x_1 - y_1|, |x_2 - y_2|).$$

We will show that the three metrics $d_1$, $d_2$, and $d_\infty$ are equivalent on $R^2$.


3.11.1 LEMMA. With $d_1$, $d_2$, and $d_\infty$ defined as above one has

(a) $d_\infty(x,y) \le d_2(x,y) \le d_1(x,y)$,
(b) $d_2(x,y) \le \sqrt{2}\, d_\infty(x,y)$,
(c) $d_1(x,y) \le \sqrt{2}\, d_2(x,y) \le 2 d_\infty(x,y)$,

for all x, y in $R^2$.


Proof: Note that $d_2(x,y) \ge |x_1 - y_1|$ and similarly $d_2(x,y) \ge |x_2 - y_2|$.
Consequently $d_2(x,y)$ is at least as large as the larger of these two numbers, that is,
$d_2(x,y) \ge d_\infty(x,y)$. On the other hand, $d_2(x,y)^2 = |x_1 - y_1|^2 + |x_2 - y_2|^2 \le d_1(x,y)^2$,
so $d_2(x,y) \le d_1(x,y)$. This completes the proof of (a).

To prove (b) we first note that $d_\infty(x,y)^2 = \max\{|x_1 - y_1|^2, |x_2 - y_2|^2\}$. Then

$$d_2(x,y)^2 \le 2 \max\{|x_1 - y_1|^2, |x_2 - y_2|^2\}, \quad \text{so} \quad d_2(x,y) \le \sqrt{2}\, d_\infty(x,y).$$

To prove (c) we need an algebraic lemma,⁶ namely $2ab \le a^2 + b^2$ for all real
numbers a and b. Applying it here we get

$$d_1(x,y)^2 = |x_1 - y_1|^2 + |x_2 - y_2|^2 + 2|x_1 - y_1|\,|x_2 - y_2| \le 2|x_1 - y_1|^2 + 2|x_2 - y_2|^2 = 2 d_2(x,y)^2.$$

Hence $d_1(x,y) \le \sqrt{2}\, d_2(x,y)$. The remaining inequality in (c), $\sqrt{2}\, d_2(x,y) \le 2 d_\infty(x,y)$, follows from (b). ∎
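The inequalities of Lemma 3.11.1 can be spot-checked numerically. The following is only a sketch in Python, not part of the text's argument: the function names (`d1`, `d2`, `dinf`, `lemma_holds`, `check_random_pairs`) and the small floating-point tolerance are assumptions introduced here for illustration.

```python
import math
import random

def d1(x, y):
    """Sum metric on R^2."""
    return abs(x[0] - y[0]) + abs(x[1] - y[1])

def d2(x, y):
    """Euclidean metric on R^2."""
    return math.hypot(x[0] - y[0], x[1] - y[1])

def dinf(x, y):
    """Max metric on R^2."""
    return max(abs(x[0] - y[0]), abs(x[1] - y[1]))

def lemma_holds(x, y, tol=1e-9):
    """Check (a), (b), (c) of the lemma for one pair, up to rounding error."""
    a, b, c = d1(x, y), d2(x, y), dinf(x, y)
    root2 = math.sqrt(2)
    return (c <= b + tol and b <= a + tol      # (a)
            and b <= root2 * c + tol           # (b)
            and a <= root2 * b + tol           # (c), first inequality
            and root2 * b <= 2 * c + tol)      # (c), second inequality

def check_random_pairs(trials=10_000, seed=0):
    """Return True if the lemma holds on every randomly drawn pair."""
    rng = random.Random(seed)
    for _ in range(trials):
        x = (rng.uniform(-5, 5), rng.uniform(-5, 5))
        y = (rng.uniform(-5, 5), rng.uniform(-5, 5))
        if not lemma_holds(x, y):
            return False
    return True
```

A random check of this kind proves nothing, of course; it only guards against misreading the direction of an inequality. The pair x = (0,0), y = (1,1) shows that the constants in (b) and (c) cannot be improved.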


3.11.2 THEOREM. The metrics $d_1$, $d_2$, and $d_\infty$ are equivalent on $R^2$. (Hence
the metric spaces $(R^2,d_1)$, $(R^2,d_2)$, and $(R^2,d_\infty)$ are homeomorphic.)


Proof: We shall use Theorem 3.9.2. Let $\{x_n\}$ be a sequence in $R^2$. By applying
the last lemma to $d_1(x_0,x_n)$, $d_2(x_0,x_n)$, and $d_\infty(x_0,x_n)$, we see that

$$\lim x_n = x_0 \text{ in } (R^2,d_\infty) \iff \lim x_n = x_0 \text{ in } (R^2,d_2) \iff \lim x_n = x_0 \text{ in } (R^2,d_1).$$ ∎


⁶ The proof of this is simply to observe that $a^2 - 2ab + b^2 = (a - b)^2 \ge 0$.




EXAMPLE 2. On the set $X = R^n$, define the metrics

$$d_1(x,y) = \sum_{i=1}^{n} |x_i - y_i|,$$
$$d_2(x,y) = \left(\sum_{i=1}^{n} |x_i - y_i|^2\right)^{1/2},$$
$$d_p(x,y) = \left(\sum_{i=1}^{n} |x_i - y_i|^p\right)^{1/p}, \quad 1 \le p < \infty,$$
$$d_\infty(x,y) = \max_{1 \le i \le n} |x_i - y_i|.$$

We ask the reader to modify the proof of Lemma 3.11.1 and Theorem 3.11.2 to
establish the following facts. Further, see Exercise 1.


3.11.3 LEMMA. With $d_1$, $d_2$, and $d_\infty$ defined on $R^n$ as above, one has

(a) $d_\infty(x,y) \le d_2(x,y) \le d_1(x,y)$,
(b) $d_2(x,y) \le \sqrt{n}\, d_\infty(x,y)$,
(c) $d_1(x,y) \le \sqrt{n}\, d_2(x,y)$,

for all x, y in $R^n$.

3.11.4 THEOREM. The metrics $d_1$, $d_2$, and $d_\infty$ are equivalent on $R^n$. ∎


EXAMPLE 3. Let $X = R^n$ be given with the metric

$$d_2(x,y) = (|x_1 - y_1|^2 + \cdots + |x_n - y_n|^2)^{1/2}.$$

Let Y be the set made up of all functions y of the form

$$y(t) = a_1 \cos t + a_2 \cos 2t + \cdots + a_n \cos nt,$$

where $0 \le t \le 2\pi$ and $a_1, \ldots, a_n$ is any n-tuple of real numbers. Define a metric
on the set Y by

$$\sigma_2(y_1,y_2) = \left(\int_0^{2\pi} |y_1(t) - y_2(t)|^2\, dt\right)^{1/2}.$$

Let G be the mapping of $(X,d_2)$ onto $(Y,\sigma_2)$ defined by

$$G\{x_1, \ldots, x_n\} = x_1 \cos t + \cdots + x_n \cos nt.$$

We leave it to the reader to show that G is invertible and that its inverse can be
represented by

$$G^{-1}(y) = \left\{\frac{1}{\pi}\int_0^{2\pi} y(t)\cos t\, dt,\ \frac{1}{\pi}\int_0^{2\pi} y(t)\cos 2t\, dt,\ \ldots,\ \frac{1}{\pi}\int_0^{2\pi} y(t)\cos nt\, dt\right\}.$$

Furthermore, we claim that G and $G^{-1}$ are continuous. See Exercise 5. ∎
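The inversion formula of Example 3 can be tried out numerically. The sketch below is an illustration only: the names `G` and `G_inverse` mirror the text, but the equally spaced Riemann sum and the `samples` parameter are choices made here, not anything the text prescribes. For trigonometric polynomials of low degree, such a sum over a full period reproduces the integral up to rounding error.

```python
import math

def G(coeffs):
    """Return the function t -> coeffs[0]*cos(t) + ... + coeffs[n-1]*cos(n t)."""
    def y(t):
        return sum(a * math.cos((k + 1) * t) for k, a in enumerate(coeffs))
    return y

def G_inverse(y, n, samples=2000):
    """Approximate (1/pi) * integral_0^{2 pi} y(t) cos(k t) dt for k = 1..n
    with an equally spaced Riemann sum over one period."""
    h = 2 * math.pi / samples
    ts = [j * h for j in range(samples)]
    return [sum(y(t) * math.cos(k * t) for t in ts) * h / math.pi
            for k in range(1, n + 1)]
```

For instance, `G_inverse(G([1.5, -2.0, 0.25]), 3)` should return the three coefficients again, up to rounding.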




EXAMPLE 4. Let $X = l_2(-\infty,\infty)$ and $Y = L_2[0,1]$ be given with the usual
metrics. It will be shown in Example 2, Section 5.19, that these two metric spaces
are homeomorphic. This is one of the most important examples of homeomorphic
metric spaces. ∎


EXERCISES 
1. Define $d_p(x,y)$ on $R^2$ by

$$d_p(x,y) = \{|x_1 - y_1|^p + |x_2 - y_2|^p\}^{1/p}, \quad 1 \le p < \infty.$$

Show that
$$d_\infty(x,y) \le d_p(x,y) \le d_1(x,y)$$
d(x,y) < J 2d..(x,y) 
d,(x,y) < /2d,(x,y) 


for all x, y in $R^2$. [Hint: Let $0 < \lambda < 1$ and set $F(p) = \lambda^p + (1 - \lambda)^p$. Show
that $F(p) \le 1$ for $1 \le p < \infty$ by computing $dF/dp$.]


2. With $d_p$ as given in Exercise 1, show that $d_p$ is equivalent to $d_2$ on $R^2$.

3. Prove Lemma 3.11.3.

4. Prove Theorem 3.11.4.

5. The following questions refer to Example 3.


(a) Show that G is a homeomorphism. 

(b) Is G the only homeomorphism between $(X,d_2)$ and $(Y,\sigma_2)$?

(c) Is G an isometry? If not, are there any isometries between $(X,d_2)$ and
$(Y,\sigma_2)$?

(d) Is the mapping H: X > Y given by 


H{x,,...,X,} =x,2 cost +--+ + .x,° cos nt 


a homeomorphism ? 


6. Let X be the set of all ordered n-tuples of 1's and 0's with the metric $d_1$ given in
Example 17, Section 3.
Let W be a set containing n elements and let Y be the set of all subsets of
W. Let A and B denote arbitrary elements in Y, and define a metric on Y by

$$d_2(A,B) = \text{number of elements of } W \text{ in the subset } A \,\Delta\, B.$$

For example, in Figure 3.11.1, one has $d_2(A,B) = 3$.

(a) Show that $(X,d_1)$ and $(Y,d_2)$ are isometric.

(b) Which invertible mappings of $(X,d_1)$ onto $(Y,d_2)$ are isometries?

(c) How many isometries are there of $(X,d_1)$ onto $(Y,d_2)$? How many open
sets are there in each of these topologies? If a metric space $(Z,d_3)$ is
homeomorphic to $(X,d_1)$, how many open sets are there in the topology
generated by $d_3$?




Figure 3.11.1. 


7. Suppose that X and Y are finite sets containing exactly four elements, that is,
$X = \{x_1,x_2,x_3,x_4\}$ and $Y = \{y_1,y_2,y_3,y_4\}$. Suppose we define metrics $d_1$ and
$d_2$ on X and Y, respectively, with the aid of the two diagrams in Figure 3.11.2.
In both diagrams the distance between adjacent nodes is 1. The distance
between nonadjacent nodes is equal to the fewest number of branches that
need be traversed in going from one node to the other. For example,

$$d_1(x_1,x_2) = 1, \quad d_1(x_1,x_4) = 3, \quad d_2(y_1,y_2) = 1, \quad d_2(y_1,y_4) = 2.$$


Figure 3.11.2. 


(a) Are the metric spaces (X,d,) and (Y,d,) homeomorphic to one another? 

(b) Are the metric spaces (X,d,) and (Y,d,) isometric to one another? 

(c) How many one-to-one mappings of (X,d,) onto (Y,d,) are there? How 
many, if any, of them are homeomorphisms? Isometries ? 

(d) What are the topologies generated by d, and d,? 


8. Is it true that if two metric spaces have finite underlying sets with the same 
number of elements, then they are homeomorphic? Are they isometric ? 


9. Let $(X,d_1)$ be the metric space

$$\{R^2,\ d_1(x,y) = |x_1 - y_1| + |x_2 - y_2|\}.$$

Let Y be the set of all functions y of the form $y(t) = a_1 + a_2 t$ where $a_1$ and $a_2$
are arbitrary real numbers and $0 \le t \le 1$. Define a metric on Y by

$$d_2(y_1,y_2) = \begin{cases} 0, & \text{if } y_1 = y_2, \\ 1, & \text{otherwise.} \end{cases}$$


(a) What is the topology on Y generated by d,? 

(b) Which mappings of (Y,d,) into (Y,d,) are continuous? 

(c) Which mappings (X,d,) into (Y,d,) are continuous? 
[Hint: Consider constant mappings first.] 

(d) Are any of the mappings in (c) invertible? 

(e) Show that (X,d,) and (Y,d,) are not homeomorphic. 

10. (Continuation of Exercise 3, Section 4.) Let $Q^+$ denote the positive rational
numbers with the usual metric. Discuss the mapping $\phi: N \times N \to Q^+$ given
by $\phi(n_1,n_2) = n_1 n_2^{-1}$.

11. Consider the interval [0,1] with the usual metric. We say that two homeomorphisms
$f_0$, $f_1$ from [0,1] into [0,1] are homotopically equivalent if there is
a continuous mapping $F(t,s): [0,1] \times [0,1] \to [0,1]$ such that for each s, $F(\cdot,s)$
is a homeomorphism of [0,1] into itself and $F(t,0) = f_0(t)$ and $F(t,1) = f_1(t)$.
Show that every homeomorphism of [0,1] into itself is homotopically equivalent
to either $g(t) = t$ or $h(t) = 1 - t$. Show that g and h are not homotopically
equivalent.


12. CLOSED SETS AND THE CLOSURE OPERATION


In this section we shall study the properties of closed sets, which we now 
define. 


3.12.1 DEFINITION. Let $(X,d)$ be a metric space. A subset $A \subset X$ is said to
be closed if its complement $A' = X - A$ is an open set.


Since $X' = \emptyset$ and $\emptyset' = X$ and since $\emptyset$ and X are both open sets, they are
also closed sets. At first it may be surprising, but it is nevertheless true that sets
can be both open and closed. (See Exercise 1.) Also note that it is possible for
a set to be neither open nor closed.


EXAMPLE 1. Let X be given by

$$X = \{x : 0 \le x \le 1 \text{ or } 2 \le x \le 3\} = [0,1] \cup [2,3].$$

Let $d(x,y) = |x - y|$. It is easily shown that the set $A = \{x : 0 \le x \le 1\}$ is both
an open and a closed subset of the metric space $(X,d)$. Of course, A is not an open
subset of $(R,d)$, the real line with the usual metric. This latter fact shows that we
have to be careful to state exactly which universe space X is being considered. ∎

In Section 11, we referred to "closed balls" and "closed local neighborhoods."
Let us now prove that these "closed" sets are closed in the technical
sense defined above.




3.12.2 LEMMA. Let $(X,d)$ be a metric space and $B_r[x_0] = \{x \in X : d(x,x_0) \le r\}$,
$S_r(x_0) = \{x \in X : d(x,x_0) = r\}$. Then $B_r[x_0]$ and $S_r(x_0)$ are closed sets for $0 < r < \infty$.


Proof: First consider $B_r[x_0]$. We must show that the complement $B_r[x_0]' =
X - B_r[x_0]$ is open. Let $y \in B_r[x_0]'$. Then $d(y,x_0) = \alpha > r$. Now set $\beta = \alpha - r > 0$.
Since

$$d(x_0,z) \ge d(y,x_0) - d(y,z) = \alpha - d(y,z),$$

we see that if $d(y,z) < \beta$, then $d(x_0,z) > \alpha - \beta = r$. In terms of local neighborhoods
this means that $B_\beta(y) \subset B_r[x_0]'$, see Figure 3.12.1. Hence $B_r[x_0]'$ is open.


Figure 3.12.1.


Since $S_r(x_0)' = (B_r(x_0) \cup B_r[x_0]')$ is the union of two open sets, it is open and
$S_r(x_0)$ is closed. ∎


If $\mathcal{F}$ denotes the class of all closed sets in a metric space $(X,d)$, then by using
Theorem 3.10.6 and De Morgan's formulas for complementation we have the
following theorem.


3.12.3 THEOREM. Let $\mathcal{F}$ denote the class of all closed sets in a metric space
$(X,d)$. Then


(1) $\emptyset$, the empty set, and X are in $\mathcal{F}$.
(2) If $A_\alpha$ is any collection of closed sets, then $\bigcap_\alpha A_\alpha$ is closed.
(3) If $A_1, \ldots, A_n$ is a finite collection of closed sets, then $\bigcup_{i=1}^{n} A_i$ is closed.


Let us consider some examples. 


EXAMPLE 2. Let $X = R$ and d be the usual metric. It follows from Exercise
9.6 that each closed set in $(X,d)$ is the complement of a union of open intervals.
In this sense, then, it is "easy" to characterize the class of all closed sets in $(X,d)$.
Even so, the complement of a union of open intervals can be a rather complex
object. One classic example is the Cantor set, which is constructed in Appendix D.




We invite the reader to review this construction. For now we merely make two 
observations: 


(1) The Cantor set contains an uncountable number of points. 
(2) The Cantor set does not contain a nontrivial interval. 


It follows, then, that if we try to express a closed set as the union of intervals, 
we might need an uncountable number. This shows that closed sets can be quite 
pathological. J 
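To make the two observations more concrete, the finite stages of the construction can be computed exactly. The sketch below assumes the standard middle-thirds construction on [0,1] (Appendix D is not reproduced here, so this is an assumption), and the function names are hypothetical. It uses exact rational arithmetic so that no rounding enters.

```python
from fractions import Fraction

def cantor_step(intervals):
    """Remove the open middle third from each closed interval [a, b]."""
    out = []
    for a, b in intervals:
        third = (b - a) / 3
        out.append((a, a + third))
        out.append((b - third, b))
    return out

def cantor_stage(k):
    """Closed intervals remaining after k removal steps, starting from [0, 1]."""
    intervals = [(Fraction(0), Fraction(1))]
    for _ in range(k):
        intervals = cantor_step(intervals)
    return intervals

def total_length(intervals):
    """Sum of the lengths of the given intervals."""
    return sum(b - a for a, b in intervals)
```

Stage k consists of $2^k$ closed intervals, each of length $3^{-k}$, so the total length $(2/3)^k$ tends to 0. This is consistent with observation (2): any interval contained in the Cantor set lies in a single interval of every stage, so it has length at most $3^{-k}$ for every k.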


EXAMPLE 3. Let $d_\infty$ be the sup-metric on $C[0,T]$. Then the set A of all
$x \in C[0,T]$ such that $|x(t)| \le 1$ for all t is a closed set. In fact, $A = B_1[0]$, that is,
A is just the closed ball of radius 1 centered at $x(t) = 0$. ∎


EXAMPLE 4. Consider the metric space $(l_2,d_2)$ (see Example 4, Section 3).
Let A be the Hilbert cube, that is, the set of all points $x = \{x_1,x_2,\ldots\}$ in $(l_2,d_2)$
such that $|x_n| \le 1/n$. We leave it as an exercise to show that the Hilbert cube is
closed. ∎


The next thing to do is investigate the structure of closed sets. The concept 
of point of adherence will be basic for this investigation. 


3.12.4 DEFINITION. Let A be a subset of a metric space (X,d). A point x 
in (X,d) is said to be a point of adherence of A if each open set of (X,d) containing 
x also contains a point y in A. 

Notice that this is equivalent to saying that each local neighborhood N of x
contains a point y in A. It should be noted that we do not ask that the point of
adherence x be in A. We only ask that the local neighborhoods meet A. It should
be clear that every point in a set A is a point of adherence of A.

We sometimes consider another kind of point that is close to a set A. A point 
x that is a point of adherence of A — {x} is said to be a point of accumulation of A. 
It is clear that every point of accumulation is also a point of adherence. 


EXAMPLE 5. Let d be the usual metric on R and let $A = (0,1)$ and $B = \{0\}$.
Clearly, each point in the interval (0,1) along with 0 and 1 are points of adherence
of A. In fact, these are all the points of adherence of A. Similarly $x = 0$ is the only
point of adherence for B. (What about the points of accumulation?) ∎


EXAMPLE 6. Let X be the set of all ordered n-tuples of 1's and 0's, and let

$$d(x,y) = \text{number of places where } x \text{ and } y \text{ disagree.}$$

Let A be an arbitrary set in the metric space $(X,d)$. Since $B_{1/2}(x) = \{x\}$, every
point of adherence of A lies in A. ∎
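The key fact in Example 6, that the open ball of radius 1/2 about a point contains only that point, can be confirmed by brute force for small n. This is a sketch only; `hamming` and `open_ball` are hypothetical names introduced here.

```python
from itertools import product

def hamming(x, y):
    """Number of places where the tuples x and y disagree."""
    return sum(a != b for a, b in zip(x, y))

def open_ball(center, radius, n):
    """All n-tuples of 0's and 1's at distance strictly less than radius from center."""
    return [x for x in product((0, 1), repeat=n) if hamming(center, x) < radius]
```

Because the metric takes only integer values, every ball of radius at most 1 reduces to its center, so every subset of X is open and hence every subset is also closed.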




EXAMPLE 7. Let $X = C[0,T]$ be given with the sup-metric $d_\infty$ and let A
be the set in $(X,d_\infty)$ made up of all functions x such that $x(0) = 0$ and $|x(t)| \le 1$
for all t. (See Figure 3.12.2.) We note that the point $x_0(t) = 1$ is not a point of
adherence of A, since $d_\infty(x_0,x) \ge |x_0(0) - x(0)| = 1$ for each $x \in A$. ∎


Figure 3.12.2.


EXAMPLE 8. This example is a variation on the preceding one. Let $X =
C[0,T]$ with the metric

$$d(x,y) = \left\{\int_0^T |x(t) - y(t)|^2\, dt\right\}^{1/2}$$

and let the set A be defined as above. Now it is true that the point $x_0(t) = 1$ is a
point of adherence of A. (Why?) ∎
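The contrast between Examples 7 and 8 can be seen numerically with one particular sequence in A. The sketch below takes T = 1 and the ramp functions $x_n(t) = \min(1, nt)$; both are choices made here for illustration, not taken from the text. Each $x_n$ satisfies $x_n(0) = 0$ and $|x_n(t)| \le 1$, so it lies in A.

```python
def x_n(n):
    """Ramp function: rises linearly from 0 to 1 on [0, 1/n], then stays at 1."""
    return lambda t: min(1.0, n * t)

def sup_distance_to_one(x, T=1.0, samples=10_000):
    """Grid approximation of sup |1 - x(t)| over [0, T] (the grid contains t = 0)."""
    return max(abs(1.0 - x(j * T / samples)) for j in range(samples + 1))

def l2_distance_to_one(x, T=1.0, samples=20_000):
    """Midpoint-rule approximation of (integral_0^T |1 - x(t)|^2 dt)^(1/2)."""
    h = T / samples
    total = sum((1.0 - x((j + 0.5) * h)) ** 2 for j in range(samples))
    return (total * h) ** 0.5
```

In the sup-metric the distance from $x_n$ to the constant function 1 stays equal to 1, while in the integral metric a direct computation gives $(3n)^{-1/2}$, which tends to 0. So every local neighborhood of $x_0$ in the metric of Example 8 meets A.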


If A is a set in a metric space $(X,d)$, let $\bar{A}$ denote the set of all points of
adherence of A. The set $\bar{A}$ is sometimes referred to as the closure of A. Let us see what
can be said about $\bar{A}$.


3.12.5 THEOREM. A set A in a metric space $(X,d)$ is closed if and only if
$A = \bar{A}$.


Proof: First assume that $A = \bar{A}$. We want to show that the complement $A'$
is open. So let $x \in A'$. Since x is not a point of adherence of A, there is a local
neighborhood N of x that does not meet A, that is, $N \cap A = \emptyset$. Hence $A'$ is open,
so A is closed.

Now assume that A is closed. Since the inclusion $A \subset \bar{A}$ is always true we want
to show that $\bar{A} \subset A$. Let $x \in \bar{A}$ and assume that $x \notin A$. That is, x is in the open set
$A'$. Hence there is a local neighborhood N of x with $N \cap A = \emptyset$. But this contradicts
the fact that x is a point of adherence of A. Therefore one has $x \in A$. ∎


3.12.6 THEOREM. Let A be a set in a metric space $(X,d)$, then

(1) $\bar{\bar{A}} = \bar{A}$, and
(2) $\bar{A}$ is a closed set.


Proof: (1) One always has the inclusion $\bar{A} \subset \bar{\bar{A}}$. Let $x \in \bar{\bar{A}}$ and let N be an
arbitrary open local neighborhood of x. In order to show that $x \in \bar{A}$ we must show
that there is a $z \in N \cap A$. Since x is a point of adherence of $\bar{A}$, there is a point y
in $N \cap \bar{A}$. Let M be a local neighborhood of y (see Figure 3.12.3) satisfying $M \subset N$.
(Here we use the fact that N is open.) Since y is a point of adherence of A there is a
$z \in M \cap A \subset N \cap A$.

(2) This follows directly from (1) and Theorem 3.12.5. ∎


Figure 3.12.3.


The reader should be careful not to confuse the concepts of "point of adherence"
and "limit of a sequence." Point of adherence is a concept attributable
to sets. Limit is a concept attributable to sequences. Of course, the range of a
sequence (that is, the set of all elements $x_n$ in the sequence) is a set in $(X,d)$, and
it can have points of adherence.

Keeping the foregoing warning in mind, it is nevertheless still possible to 
characterize closed sets using the limit concept. 


3.12.7 THEOREM (THE CLOSED SET THEOREM). A set A in a metric space
$(X,d)$ is a closed set if and only if every convergent sequence $\{x_n\}$ with $\{x_n\} \subset A$ has
its limit in A.


Proof: First assume that A is closed and let $\{x_n\}$ be a convergent sequence
in A with $x_0 = \lim x_n$. We want to show that $x_0 \in A$. Since $x_0 = \lim x_n$, the
sequence $\{x_n\}$ is eventually in each local neighborhood of $x_0$. But this means that
$x_0$ is a point of adherence of A. Since A is closed, it follows from Theorem 3.12.5
that $x_0 \in A$.

Let us now show the converse. Assume that every convergent sequence in A
has its limit in A. Let $x_0 \in \bar{A}$. Then there is at least one point $x_n$ in $B_{1/n}(x_0) \cap A$
for each n. Since $x_0 = \lim x_n$, we see that $x_0 \in A$. Since A contains all of its points
of adherence, it is closed. ∎




The two concepts of closed set and point of adherence lead to other concepts 
which are of use. We rather unimaginatively list them here and ask the reader to 
wade through them on the promise that they will indeed be useful. 


3.12.8 DEFINITION. A set A in a metric space $(X,d)$ is said to be dense in
$(X,d)$ (or everywhere dense) if the closure of A is X, that is, $\bar{A} = X$.


Intuitively speaking, this says that for each point x in X, there are points in A 
arbitrarily close to x. Obviously, the set X is always dense in (X,d). However, 
there are many situations where a proper subset of a metric space is everywhere 
dense. For example, the rational numbers are dense in the real numbers, with the 
usual metric. 


EXAMPLE 9. Let $d_\infty$ be the sup-metric on $C[0,T]$. Let $P[0,T]$ be the set of all
polynomials, $p(t) = a_0 + a_1 t + \cdots + a_n t^n$, with real coefficients, defined on $[0,T]$,
$n = 0, 1, 2, \ldots$. $P[0,T]$ is a subset of $C[0,T]$ and $P[0,T]$ is dense in $\{C[0,T], d_\infty\}$.
The fact that this is the case is a consequence of the famous Weierstrass Approximation
Theorem, see Exercises 20 and 21. ∎
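One concrete way to produce approximating polynomials is through the Bernstein polynomials of Exercise 20. The sketch below works on [0,1], that is, it assumes T = 1; the helper names and the grid-based estimate of the sup-metric are choices made here. It illustrates the density claim on one function and is not a proof.

```python
from math import comb

def bernstein(f, n):
    """Return the n-th Bernstein polynomial of f on [0, 1] as a function of x."""
    def B(x):
        return sum(f(k / n) * comb(n, k) * x**k * (1 - x)**(n - k)
                   for k in range(n + 1))
    return B

def sup_error(f, g, samples=200):
    """Grid estimate of the sup-metric distance between f and g on [0, 1]."""
    return max(abs(f(j / samples) - g(j / samples)) for j in range(samples + 1))
```

For the continuous but non-differentiable function $f(x) = |x - 1/2|$, the estimated distance from f to its Bernstein polynomial shrinks as n grows, although slowly. The Bernstein polynomials are rarely the best approximants; their merit is that the convergence proof is elementary.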


3.12.9 DEFINITION. A metric space (X,d) is said to be separable if it contains 
a dense set A that is countable. 


For example, the set of real numbers with the usual metric forms a separable 
metric space since the set of rational numbers is dense and countable. Note that a 
separable metric space can easily contain more than one set which is dense and 
countable. 

The next lemma follows from the definition of separable metric spaces. 


3.12.10 LEMMA. A metric space $(X,d)$ is separable if and only if there is
a countable set $\{x_n\}$ with the following property: For each $\epsilon > 0$ and each x in $(X,d)$
there is at least one $x_n$ with $d(x_n,x) < \epsilon$.


This lemma offers an interesting view of separable metric spaces. It says that 
an arbitrarily accurate approximation to any point in the separable metric space 
is contained in the countable set {x,}. 
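For the real line with the rationals as the countable set, the approximation property of the lemma is easy to realize explicitly. The sketch below is an illustration under that assumption; `rational_within` is a hypothetical helper, and it treats a floating-point number as a stand-in for a real number.

```python
from fractions import Fraction
import math

def rational_within(x, eps):
    """Return a rational m/q with |m/q - x| < eps.

    With q > 1/eps, the nearest multiple of 1/q to x is within 1/(2q) < eps.
    """
    q = math.floor(1 / eps) + 1
    return Fraction(round(x * q), q)
```

For example, `rational_within(math.pi, 1e-6)` returns a fraction whose value differs from $\pi$ by less than $10^{-6}$.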

Let us consider some examples of separable metric spaces. 


EXAMPLE 10. Let X be $R^n$ or $C^n$ with the metric

$$d_1(x,y) = |x_1 - y_1| + \cdots + |x_n - y_n|.$$

Let A denote the set of all points $x = (x_1,\ldots,x_n)$ in X with rational coordinates
(for $C^n$, coordinates with rational real and imaginary parts). A is clearly countable
and dense. Hence $(X,d_1)$ is separable. ∎


EXAMPLE 11. Consider $l_2 = l_2(0,\infty)$ with the usual metric $d_2$. This metric
space is separable. Let A be the subset of $l_2$ made up of all sequences $\{r_n\}$ which

(1) have rational entries only, and
(2) have only a finite number of nonzero entries.


First, let us show that A is countably infinite. Let $A_k$, $k = 1, 2, \ldots$, be the subset
of A made up of sequences with $x_n = 0$ for $n \ge k$ and $x_{k-1} \ne 0$. Then $A = \bigcup_k A_k$.
Each set $A_k$ is countable, that is, $A_k$ can be put into one-to-one correspondence
with the positive integers. (Why?) [$A_1$ contains only one sequence: $(0,0,\ldots)$.]
Let $a_k(j)$ denote the element of $A_k$ corresponding to the positive integer j, that is,
each $a_k(j)$ is a sequence. Form the following array:

$$\begin{array}{llll}
a_1(1) & & & \\
a_2(1), & a_2(2), & a_2(3), & a_2(4), \ \ldots \\
a_3(1), & a_3(2), & a_3(3), & \ldots \\
a_4(1), & \ldots & &
\end{array}$$


This array contains every element in the set A exactly once, and using the diagonal 
counting pattern indicated by the arrows we can put A into one-to-one corres- 
pondence with the positive integers. Hence, A is countable. 
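The diagonal counting argument can be made explicit. The sketch below enumerates index pairs (k, j), standing for the array entry $a_k(j)$, along successive anti-diagonals. The direction of the sweep is one possible choice, since the arrows of the printed array are not reproduced here, and the function names are hypothetical. In the array of the text the first row has a single entry, so positions of missing entries would simply be skipped.

```python
from itertools import count, islice

def diagonal_pairs():
    """Yield every pair (k, j) of positive integers exactly once,
    sweeping the anti-diagonals k + j = 2, 3, 4, ..."""
    for s in count(2):
        for k in range(1, s):
            yield (k, s - k)

def position(k, j):
    """1-based position of (k, j) in the order produced by diagonal_pairs()."""
    s = k + j
    return (s - 2) * (s - 1) // 2 + k
```

The closed formula in `position` shows that the correspondence with the positive integers is one-to-one: every pair receives a position, and distinct pairs receive distinct positions.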

Next let us show that $\bar{A} = l_2$. Let $x = \{x_1,x_2,\ldots\}$ be an arbitrary point in
$l_2$. Since

$$\sum_{i=1}^{\infty} |x_i|^2 < \infty,$$

it follows that given any $\epsilon > 0$ there exists an integer N such that

$$\sum_{i=N+1}^{\infty} |x_i|^2 < \frac{\epsilon^2}{2}.$$

Obviously, there exist rational numbers $r_1, r_2, \ldots, r_N$ such that

$$\sum_{i=1}^{N} |x_i - r_i|^2 < \frac{\epsilon^2}{2}.$$

Then $r = \{r_1,r_2,\ldots,r_N,0,0,\ldots\}$ is a point in A and $d_2(x,r) < \epsilon$. By Lemma 3.12.10
we see that this space is separable. ∎
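The two-step approximation in Example 11 (truncate, then round to rationals) can be carried out for a specific point of $l_2$. The sketch below uses $x_i = \sqrt{2}/i$, a point chosen here only because its tail is easy to bound: $\sum_{i>N} 2/i^2 < 2/N$. The function name and the particular choices of N and of the common denominator q are assumptions made for illustration.

```python
from fractions import Fraction
import math

def approximate(eps):
    """Return (r, bound): a finite list r of rationals approximating
    x = (sqrt(2)/1, sqrt(2)/2, ...) in l_2, and an upper bound on d_2(x, r)."""
    # Truncation: sum_{i>N} 2/i^2 < 2/N <= eps^2/2 once N >= 4/eps^2.
    N = math.ceil(4 / eps**2)
    # Rounding to multiples of 1/q changes each entry by at most 1/(2q).
    q = math.ceil(math.sqrt(2 * N) / eps) + 1
    r = [Fraction(round(math.sqrt(2) / i * q), q) for i in range(1, N + 1)]
    head = sum((math.sqrt(2) / i - float(ri)) ** 2
               for i, ri in enumerate(r, start=1))
    tail_bound = 2 / N
    return r, math.sqrt(head + tail_bound)
```

The returned bound is below eps because the rounding errors contribute at most $\epsilon^2/8$ and the tail at most $\epsilon^2/2$ to the squared distance.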


EXAMPLE 12. The space $L_2(I)$ with the usual metric $d_2$ is also separable. The
proof of this follows from the facts that separability is a topological concept
(Exercise 15) and that $L_2(I)$ is isometric with $l_2$ (Section 5.19). ∎


EXERCISES


1. Show that a metric space (X,d) is disconnected if and only if it contains a set 
other than @ or X which is both open and closed. (For the definition of a 
disconnected metric space see Exercise 4, Section 10.) 


2. What is the closure of the set A in Example 7? 


3. What is the closure of the set A in Example 8?

4. Show that the space $(l_p,d_p)$, $1 \le p < \infty$, is separable. (See Example 4, Section
3.) What happens for $p = \infty$?

5. We introduced the open mapping concept in Definition 3.9.9. There is also a
closed mapping concept. A mapping F of a metric space $(X,d_1)$ into a metric
space $(Y,d_2)$ is said to be a closed mapping if the image of each closed set in
$(X,d_1)$ is a closed set in $(Y,d_2)$. Give an example of a closed mapping that is
not continuous. Give an example of a continuous mapping that is not closed.
Give an example of a mapping that is closed but not open. Give an example
of a mapping that is open but not closed. What can be said about a one-to-one,
continuous, closed mapping of $(X,d_1)$ onto $(Y,d_2)$? (Unfortunately, the
term closed is applied to transformations on normed linear spaces in a slightly
different sense from the one used here. See Exercise 9, Section 5.6.)

6. Which of the sets in Exercise 5, Section 9 are closed?

7. Is Theorem 3.12.7 true for pseudometric spaces?

8. Show that $\operatorname{diam}(A) = \operatorname{diam}(\bar{A})$ for every subset A in a metric space $(X,d)$.

9. Show that $\bar{A} = \bigcap B$ where the intersection is taken over all closed sets B with
$A \subset B$. (Note that $\bar{A}$ is the smallest closed set containing A.)

10. Prove the following three statements about metric spaces:

(a) Given any point $x_0$ in a metric space $(X,d)$ and a closed set A with $x_0 \notin A$,
there exist disjoint open sets $O_1$ and $O_2$ such that $x_0 \in O_1$ and $A \subset O_2$.

(b) Given any pair of disjoint closed sets, $A_1$ and $A_2$, in a metric space $(X,d)$,
there exist disjoint open sets $O_1$ and $O_2$ such that $A_1 \subset O_1$ and $A_2 \subset O_2$. (This
shows that a metric space is normal, see Kelley [1; p. 112].)

(c) Let U be an open set containing x. Show that there is an open set V containing
x and such that the closure $\bar{V} \subset U$. (This shows that a metric space
is regular, see Kelley [1; p. 113].)

11. Let X be a set with the metric

$$d(x,y) = \begin{cases} 0, & \text{if } x = y, \\ 1, & \text{if } x \ne y. \end{cases}$$

Show that the metric space $(X,d)$ is separable if and only if the set X is countable.

12. Show that the Hilbert cube (Example 4) is closed.

13. Let $\{x_n\}$ be a bounded sequence in $(R,d)$ that assumes no value more than a
finite number of times. Show the sequence converges if and only if the set
$\{x \in X : x = x_n \text{ for some } n\}$ has precisely one point of accumulation. Describe
$\lim x_n$.


14. Show that a set A in $(X,d)$ is dense if and only if $U \cap A \ne \emptyset$ for every nonempty
open set U.

15. Show that continuous mappings preserve separability. (Hence separability
is a topological property.)

16. Show that the product of two separable metric spaces is a separable metric
space. Show that any subspace of a separable metric space is separable.

17. In Example 10 it was shown that $(R^n,d_1)$ and $(C^n,d_1)$ are separable. What
other metrics on these spaces would yield a separable metric space? (Use
Exercise 15.)

18. Show that $d(x,A) = 0$ if and only if $x \in \bar{A}$.

19. Show that $\overline{B_r(x)} \subset B_r[x]$. What can one say about equality? [Hint: Consider
$X = [0,1] \cup [2,3]$, $x = 1$, $r = 1$.]


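20. (Bernstein Polynomials.) Let $f \in C[0,1]$ and define

$$B_n(x;f) = \sum_{k=0}^{n} f\left(\frac{k}{n}\right)\binom{n}{k} x^k (1-x)^{n-k},$$

where $\binom{n}{k} = \dfrac{n!}{k!\,(n-k)!}$ is the binomial coefficient.

(a) Show that

$$1 = \sum_{k=0}^{n} \binom{n}{k} x^k (1-x)^{n-k}.$$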


(b) Show that 


(c) Choose M and 6(e) so that 


(;)#a — xy" 


(jeer 


If@)| <M, O<x<l 
f(x) —f()| < @, when |x — y| < 6(e). 


(d) Let x be fixed and fix n so that 


n > max{1/6(e)*; M?/e7}. 


Now define 


S={k:0<k<n and 


L={k:0sksn and ( 


k 
x—-|<n "4 
n 




(e) Show that 


n 


&s io ~ (7) (7) a — x)" <e. 


(f) Show that 


m 


keL 


ros) ()ee-oresang 8 


(g) Show that B(x; f) > f(x) uniformly for 0 < x < 1. 

21. Use Exercise 20 to prove the Weierstrass Theorem: Let f be a continuous real-valued
function on a compact interval I. Then there are polynomials $P_n$ on I
such that $P_n \to f$ uniformly.

22. Show that the polynomials in Exercise 21 can be chosen to have rational
coefficients. Hence the space $C(I)$, with the sup-metric, is separable.

23. Let X be the space of all complex-valued functions $x(t)$, $-\infty < t < \infty$, such
that

$$\lim_{T \to \infty} \frac{1}{2T} \int_{-T}^{T} |x(t)|^2\, dt < \infty.$$

Let the metric on X be given by

$$d(x,y) = \left\{\lim_{T \to \infty} \frac{1}{2T} \int_{-T}^{T} |x(t) - y(t)|^2\, dt\right\}^{1/2}.$$

Show that $(X,d)$ is not separable. [Hint: Compute $d(e^{iat}, e^{ibt})$ where a, b are
real numbers.]


24. Let A be a set in a metric space $(X,d)$. A point x is said to be an interior
point of A if A contains an open local neighborhood of x. Let Int A (interior
of A) denote the collection of all interior points of A. Let $\text{Ext } A = \text{Int } A'$. Let

$$\text{Bdy } A = (\text{Int } A \cup \text{Ext } A)'.$$

Ext A is called the exterior of A and Bdy A the boundary of A.

(a) Show that A is open if and only if $A = \text{Int } A$.

(b) Show that $x \in \text{Bdy } A$ if and only if every open set containing x meets both
A and $A'$.

(c) Show that A is closed if and only if $A = \text{Int } A \cup \text{Bdy } A$.

(d) Show that $\text{Bdy}(A) = \text{Bdy}(A')$.

(e) Show that for any set A, Int A and Ext A are open sets and Bdy A is a
closed set.

(f) Is it possible to have $A = \text{Bdy } A$?

25. Let $F: (X,d_1) \to (Y,d_2)$. Show that F is continuous if and only if the inverse
image of every closed set in $(Y,d_2)$ is a closed set in $(X,d_1)$.




26. Let Q denote the set of rational numbers in R, where R has the usual metric.
Discuss the interior, boundary, and exterior of Q and $Q'$.

27. Let C be the Cantor set in [0,1], see Appendix D. Describe Int C, Ext C, and
Bdy C.

28. Let $\{x_n\}$ be a sequence in a metric space $(X,d)$ with $\lim x_n = x_0$. Assume that
$d(x,x_n) \le \epsilon$ for all n. Show that $d(x,x_0) \le \epsilon$. [Hint: Use the fact that the
interval $[0,\epsilon]$ is closed in R.]


29. Let A be a nonempty set in a metric space $(X,d)$ and define

$$\sigma(A;x) = \inf\{d(x,y) : y \in A\}.$$

(a) Show that $\sigma(A;x) = 0$ if and only if $x \in \bar{A}$.
(b) Show that for A fixed $\sigma(A;x)$ is Lipschitz continuous in x.

Let A and B be nonempty disjoint closed sets in $(X,d)$ and define

$$f(x) = \frac{\sigma(A;x)}{\sigma(A;x) + \sigma(B;x)}.$$

(c) Show that f is continuous and $0 \le f(x) \le 1$ for all x. (Is f Lipschitz continuous?)

(d) Show that $f(x) = 0$ precisely on A and $f(x) = 1$ precisely on B.

(e) What happens to f if A and B are not closed sets?


30. (Tietze Extension Theorem.) Let A be a closed set in a metric space $(X,d)$
and let $f: A \to [0,1]$ be a continuous function. The following steps will lead
to a proof of the fact that f has a continuous extension $g: X \to [0,1]$.

(a) Let M=sup{f(x):xeA} and A, ={xeEA: f(x) <4M} and B,= 
{xe A: f(x) > #M}. Use Exercise 29 to show that there is a continuous 
function ¢,: X [0,1] with the property that 4M < ¢,(x) <%M and 
b,(x) = 4M on A, and ¢,(x) = 4M on B,. 

(b) Show by induction that there is a sequence {¢,} of continuous functions 
o,: X > [0,1] such that 


F(x) — Loi) + +++ + bn) I] S Q"M, (xe A) 


lP.COl < GQ)", (xe X). 
[ Hint: Replace f(x) with F(x)=f(x)—$,(x)+ 3M and redo part (a) 
with 
Ay = {x EA: F(x) $(4)(3)M) 
and 


B,= { x EA: F(x) >(3)'M}. 
(c) Show that g, = ¢, + :°: + @, converges uniformly on X, and the limit 
g is an extension of f. 


31. (Continuation of Exercise 30.) What happens if the range of f in Exercise 30
is a bounded set in R? What happens if f is unbounded?




32. Let $(X,d)$ be a metric space. Define a "distance" $\rho$ between two bounded,
nonempty subsets A and C of X by

$$\rho(A,C) = \inf\{\delta : A \subset B_\delta[C] \text{ and } C \subset B_\delta[A]\},$$

where $B_\delta[A] = \bigcup_{x \in A} B_\delta[x]$. Show that $\rho$ defines a pseudometric on the collection
$\mathcal{X}$ of all bounded, nonempty subsets of X. Show that $\rho(A,\bar{A}) = 0$. Show that
$\rho(\{x\},\{y\}) = d(x,y)$, where $\{x\}$ and $\{y\}$ denote the subsets containing exactly
the points x and y, respectively.


33. Show that the interior of the Hilbert cube (Example 4) is empty. 


13. COMPLETENESS 


Consider the metric space X consisting of all points in the half-open interval
(0,1], that is, $X = \{x \in R : 0 < x \le 1\}$, with the usual metric. The sequence
$\{1/n\} = \{1, \tfrac{1}{2}, \tfrac{1}{3}, \ldots\}$ lies in X. At first glance one might say that this sequence
converges to 0. However, 0 is not a point in the space X. Therefore, it is nonsense
to say that 0 is the limit. This particular sequence in X is just not convergent. On
the other hand, we still feel that there is something special about this sequence and
something special about the manner in which it fails to have a limit. What is
special is that the sequence is a "Cauchy sequence" and the metric space X is not
"complete." We investigate these concepts in this section.

First let us look at Cauchy sequences. 


3.13.1 DEFINITION. A sequence $\{x_n\}$ in a metric space $(X,d)$ is said to be a
Cauchy sequence if for each $\epsilon > 0$ there exists an N such that $d(x_n,x_m) < \epsilon$ for any
choice of $n, m \ge N$. Notice that N depends on $\epsilon$.


Another way of stating this definition is

$$\lim_{n,m \to \infty} d(x_n,x_m) = 0.$$

However, in this case it should be noted that this is a double limit and its precise
meaning is really contained in the statement of Definition 3.13.1. For example,
one could have $\lim_{n \to \infty} d(x_n,x_{2n}) = 0$ for a sequence $\{x_n\}$ that is not a Cauchy
sequence.
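A concrete sequence of this kind may help. The sketch below uses $x_n = \ln(\ln n)$ for $n \ge 2$ in R with the usual metric; this particular sequence is an illustration chosen here, not one given in the text. For it, $d(x_n, x_{2n})$ tends to 0, while $d(x_n, x_{n^2}) = \ln 2$ for every n, so the sequence cannot be a Cauchy sequence.

```python
import math

def x(n):
    """The n-th term x_n = ln(ln n), defined for integers n >= 2."""
    return math.log(math.log(n))

def gap_doubling(n):
    """d(x_n, x_{2n}) in the usual metric on R; tends to 0 as n grows."""
    return abs(x(2 * n) - x(n))

def gap_squaring(n):
    """d(x_n, x_{n^2}); equals ln 2 for every n >= 2, since ln(n^2) = 2 ln n."""
    return abs(x(n * n) - x(n))
```

Restricting the double limit to the pairs (n, 2n) therefore hides the failure that the pairs (n, n^2) reveal.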

It can be seen now that in the example given above $\{1/n\}$ is a Cauchy sequence.
Let N be any integer greater than $2/\epsilon$. If $n, m \ge N$, then $1/n < \epsilon/2$ and $1/m < \epsilon/2$.
Consequently, $|1/n - 1/m| \le 1/n + 1/m < \epsilon$ for all $n, m \ge N$.

The following lemma presents the key connection between convergent sequences 
and Cauchy sequences. 


3.13.2 LEMMA. Let $\{x_n\}$ be a convergent sequence in a metric space $(X,d)$.
Then $\{x_n\}$ is a Cauchy sequence in $(X,d)$.


Proof: Let $x_0$ be the limit of the sequence $\{x_n\}$ in $(X,d)$. Then for all n and
m one has

$$d(x_n,x_m) \le d(x_n,x_0) + d(x_0,x_m)$$

by the triangle inequality. Since $x_0$ is the limit of $\{x_n\}$, given $\epsilon > 0$ there is an N
such that $n, m \ge N$ implies that $d(x_n,x_0) < \epsilon/2$ and $d(x_0,x_m) < \epsilon/2$. But then by
the above inequality one has $d(x_n,x_m) < \epsilon$ whenever $n, m \ge N$. Hence, $\{x_n\}$ is a
Cauchy sequence. ∎


We have thus seen that every convergent sequence is a Cauchy sequence, 
and we have seen an example of a Cauchy sequence that is not a convergent 
sequence. That is, being a Cauchy sequence does not imply that a sequence is 
convergent. 

The example above suggests that the reason some Cauchy sequences fail to 
converge is a fault of the underlying set X. That is, in some sense a metric space 
(X,d) may have a "hole" in it. In the example cited above, the point 0 is "missing."
If we were to "add" this point, then the sequence {1/n} would converge. In the
new space $[0,1] = X \cup \{0\}$ with the usual metric, it can be shown that every
Cauchy sequence is convergent. That is, in the larger space a sequence is con- 
vergent if and only if it is a Cauchy sequence. 

Metric spaces possessing this property—and many do—are so important that 
they are given a name. 


3.13.3 DEFINITION. A metric space (X,d) is said to be complete if each 
Cauchy sequence in (X,d) is a convergent sequence in (X,d). 


It is difficult to overemphasize the importance of complete metric spaces. In 
many applications it is easier to show that a given sequence is a Cauchy sequence 
than to show that it is convergent. (The reason should be evident. The Cauchy 
test involves looking at the given sequence only, whereas the convergence test 
requires information outside the sequence, namely, the limit of the sequence.) 
However, if the underlying metric space is complete, showing that a sequence is a 
Cauchy sequence is enough. For example, many of the tests for convergence of 
sequences of real numbers are really tests for Cauchy sequences. In Section 15 we 
shall give one illustration of the importance of complete metric spaces, and this 
example by itself would be enough to justify giving attention to complete metric 
spaces. However, the primary significance of this concept, from our point of view, 
will be discussed in Chapter 5. 

Let us consider some examples. 


EXAMPLE 1. The space of rational numbers with the usual absolute value
metric is not a complete metric space. For example, the sequence {3, 3.1, 3.14,
3.141, 3.1415, ...} is a Cauchy sequence but it is not convergent in the present
metric space, for π is not a rational number. ∎
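
The same phenomenon can be checked with exact rational arithmetic. The sketch below is an illustration only: it swaps in a different sequence from the one in Example 1, namely the Newton iterates for the equation x·x = 2 (the function name `newton_sqrt2` and the starting value 2 are assumptions made for this sketch). Every term is a rational number and the gaps between successive terms shrink rapidly, yet no term squares to 2, so the limit lies outside the rationals.

```python
from fractions import Fraction

def newton_sqrt2(steps):
    """Return the first `steps` Newton iterates for x*x = 2 as exact rationals."""
    x = Fraction(2)                 # assumed starting value
    out = []
    for _ in range(steps):
        x = (x + 2 / x) / 2         # the result is again a Fraction
        out.append(x)
    return out

xs = newton_sqrt2(6)
# Distances between successive terms, measured in the usual metric |a - b|.
gaps = [abs(xs[i + 1] - xs[i]) for i in range(len(xs) - 1)]
# How far each term is from solving x*x = 2; never exactly zero.
residuals = [abs(x * x - 2) for x in xs]
```

Printing `xs` shows the familiar fractions 3/2, 17/12, 577/408, and so on.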


114 TOPOLOGICAL STRUCTURE 


EXAMPLE 2. The real numbers with the usual metric form a complete metric 
space. We assume that the reader has seen a development of the real number 
system. If not, we refer the reader to Rudin [1; pp. 1-47] for this development. ∎


EXAMPLE 3. The complex numbers with the usual absolute value metric 
form a complete metric space. This fact follows directly from the completeness of 
the real numbers. ∎


3.13.4 LEMMA. Let $\{x_n\}$ be a Cauchy sequence in (X,d). Then the set $\{x_1, x_2, \ldots\}$
is bounded.


Proof: With $\varepsilon = 1$ we can find an integer N such that $d(x_n, x_m) < 1$ whenever
$n, m \ge N$. Now set
$$B = \max\{d(x_1,x_2),\ d(x_1,x_3),\ \ldots,\ d(x_1,x_N)\}.$$
Then it is easily verified that $d(x_1,x_n) \le B + 1$ for all n. It follows from the
triangle inequality that $d(x_i,x_j) \le 2(B + 1)$ for all i and j. ∎


EXAMPLE 4. We now show that the metric space $(l_p,d_p)$, $1 \le p < \infty$, is
complete. (See Example 4, Section 3.) For each n let x(n) denote an element of $l_p$,
that is,

$$x(n) = \{x_1(n), x_2(n), \ldots\}.$$

Assume that $\{x(n)\}$ is a Cauchy sequence in $(l_p,d_p)$. Given any $\varepsilon > 0$, there exists
an $N(\varepsilon)$ such that

$$d_p(x(n), x(m)) = \Big[\sum_{k=1}^{\infty} |x_k(n) - x_k(m)|^p\Big]^{1/p} < \varepsilon \quad \text{for } n, m \ge N(\varepsilon).$$

It follows that for $n, m \ge N(\varepsilon)$ one has

$$|x_k(n) - x_k(m)| \le d_p(x(n), x(m)) < \varepsilon$$

for all k. But then, holding k fixed, the sequence of real or complex numbers
$\{x_k(n)\}$ is a Cauchy sequence. Since the real numbers and the complex numbers
are complete, the sequence $\{x_k(n)\}$ converges to a number $x_k(0)$. Let x(0) denote
the sequence $\{x_k(0)\}$. We show that x(0) is in $(l_p,d_p)$ and $\lim x(n) = x(0)$ in this
metric space. Since $\{x(n)\}$ is a Cauchy sequence, Lemma 3.13.4 implies that there
is a B such that

$$d_p(0, x(n)) = \Big[\sum_{k=1}^{\infty} |x_k(n)|^p\Big]^{1/p} \le B < \infty$$

for all n. This implies that

$$\Big[\sum_{k=1}^{K} |x_k(n)|^p\Big]^{1/p} \le B$$

for all n and each integer K. Since $\lim_{n\to\infty} x_k(n) = x_k(0)$, one has

$$\Big[\sum_{k=1}^{K} |x_k(0)|^p\Big]^{1/p} \le B \tag{3.13.1}$$


3.13. COMPLETENESS 115 


for each integer K. By letting $K \to +\infty$ in (3.13.1), we see that $x(0) = \{x_k(0)\}$ is
in $(l_p,d_p)$.
Then, since $n, m \ge N(\varepsilon)$ implies that

$$\Big[\sum_{k=1}^{K} |x_k(n) - x_k(m)|^p\Big]^{1/p} < \varepsilon$$

for each integer K and $\lim_{m\to\infty} x_k(m) = x_k(0)$, it follows that

$$\Big[\sum_{k=1}^{K} |x_k(n) - x_k(0)|^p\Big]^{1/p} \le \varepsilon \tag{3.13.2}$$

for each integer K. By letting $K \to +\infty$ in (3.13.2) we get

$$d_p(x(n), x(0)) \le \varepsilon$$

for all $n \ge N(\varepsilon)$. That is, $\lim_{n\to\infty} x(n) = x(0)$. Since $\{x(n)\}$ is an arbitrary Cauchy
sequence and we have shown it to be convergent, the metric space $(l_p,d_p)$ is
complete. ∎


EXAMPLE 5. Let $d_\infty$ be the sup-metric on C[0,T]. Let us show that $\{C[0,T],d_\infty\}$
is complete. Let $\{x_n\}$ be an arbitrary Cauchy sequence in $\{C[0,T],d_\infty\}$. Thus given
any $\varepsilon > 0$, there is an $N(\varepsilon)$ such that $n, m \ge N(\varepsilon)$ implies that

$$|x_n(t) - x_m(t)| \le d_\infty(x_n, x_m) \le \varepsilon$$

for all t. Then for fixed t, the sequence of numbers $\{x_n(t)\}$ is a Cauchy sequence and
therefore converges to, say, $x_0(t)$. Since t is arbitrary, the sequence of functions
$\{x_n(\cdot)\}$ converges pointwise to a function $x_0(\cdot)$. But $N(\varepsilon)$, being independent of t,
implies that $\{x_n(\cdot)\}$ converges uniformly to $x_0(\cdot)$. But it is well known that if a
sequence of continuous functions $\{x_n(\cdot)\}$ converges uniformly to a function $x_0(\cdot)$,
then $x_0(\cdot)$ is continuous. See Exercise 19. Thus every Cauchy sequence in
$\{C[0,T],d_\infty\}$ is convergent; hence, $\{C[0,T],d_\infty\}$ is complete. ∎


EXAMPLE 6. Let $1 \le p < \infty$ and let $R_p$ denote the space of all Riemann
integrable functions x defined on [a,b] with

$$\int_a^b |x(t)|^p \, dt < \infty$$

and define a metric by

$$d_p(x,y) = \Big[\int_a^b |x(t) - y(t)|^p \, dt\Big]^{1/p}.$$

It is shown in Appendix D that $(R_p,d_p)$ is not complete. However, it is also shown
in Appendix D that the Lebesgue space $(L_p,d_p)$ (see Example 10, Section 3) is
complete. ∎


EXAMPLE 7. Let (X,d) be a metric space and let Y = BC(X,R) be the space
of bounded, continuous, real-valued functions defined on X. Assume that Y has
the sup-metric $\sigma$, see Exercise 10, Section 10. We claim that $(Y,\sigma)$ is complete.
Indeed if $\{f_n\}$ is a Cauchy sequence in $(Y,\sigma)$, then

$$|f_n(x) - f_m(x)| \le \sigma(f_n, f_m)$$

for every $x \in X$. Therefore, for each $x \in X$, the sequence of real numbers $\{f_n(x)\}$
is a Cauchy sequence in R. Let $f(x) = \lim f_n(x)$. The rest of the argument now
follows Example 5 and we omit the details. ∎


Let us now look at subspaces of a metric space. If $Y \subset X$, where (X,d) is a
metric space, then (Y,d) is a metric space. We seek conditions under which the 
subspace (Y,d) is complete. It is important to note the meaning of this. That is, 
(Y,d) is complete if every Cauchy sequence in (Y,d) is convergent and (this is the 
important point) the limit is in Y. The following theorem gives the most useful 
result on this question. 


3.13.5 THEOREM. Let (X,d) be a complete metric space and let (Y,d) be a 
subspace of (X,d). Then the subspace (Y,d) is complete if and only if Y is a closed 
set in (X,d). 


[Carefully note that Y is always a closed set in (Y,d), but this does not mean 
that Y is a closed set in (X,d).] 


Proof: First assume that (Y,d) is complete. In order to show that Y is
closed in (X,d), we will show that Y contains all of its points of adherence (Theorem
3.12.5). If y is a point of adherence of Y, then each open ball $B_{1/n}(y)$, n = 1, 2, ...,
contains a point $y_n$ in Y. Since $d(y_n,y) < 1/n$, $\{y_n\}$ is a convergent sequence in
(X,d) converging to y. However, the sequence $\{y_n\}$ is a Cauchy sequence in the
complete space (Y,d); therefore, $\{y_n\}$ converges to a point $y_0$ in (Y,d). Since the
limit of a sequence is unique, $y = y_0$, or y is in Y. Hence, Y is a closed set in (X,d).

Now assume that Y is a closed set in (X,d). We want to show that (Y,d) is
complete. Let $\{y_n\}$ be an arbitrary Cauchy sequence in (Y,d). Then $\{y_n\}$ is a Cauchy
sequence in the complete metric space (X,d), so it converges to a limit $y_0$ in X.
It follows from the Closed Set Theorem that $y_0 \in Y$. Hence (Y,d) is complete. ∎


EXAMPLE 8. Let A be the Hilbert cube in $(l_2,d_2)$, see Example 4, Section 12.
Since $(l_2,d_2)$ is complete (Example 4, Section 13) and the Hilbert cube is closed
(Example 4, Section 12), the subspace $(A,d_2)$ is complete. ∎


EXAMPLE 9. Let $d_\infty$ be the sup-metric on C[0,T]. Let P[0,T] be the subset
of C[0,T] made up of all polynomials in t. The metric space $(C[0,T],d_\infty)$ is
complete (Example 5, Section 13). However, the subset P[0,T] is not closed. For
example, the sequence in P[0,T] given by

$$\Big\{1,\ 1 + t,\ 1 + t + \frac{t^2}{2!},\ 1 + t + \frac{t^2}{2!} + \frac{t^3}{3!},\ \ldots\Big\}$$

converges to $e^t$, which is not in P[0,T]. Since P[0,T] is not closed, the subspace
$(P[0,T],d_\infty)$ is not complete. ∎
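
A rough numerical check of Example 9 is sketched below. It assumes T = 1 and approximates the sup-metric by a maximum over a uniform grid; both choices, and the helper names `taylor_poly` and `sup_dist_to_exp`, are assumptions of this sketch. The distances from the polynomials to the exponential function shrink toward zero, although the exponential function is not a polynomial.

```python
import math

def taylor_poly(n, t):
    """Evaluate 1 + t + t^2/2! + ... + t^n/n! at the point t."""
    return sum(t**k / math.factorial(k) for k in range(n + 1))

def sup_dist_to_exp(n, T=1.0, grid=1001):
    """Approximate the sup-distance on [0, T] between the n-th polynomial
    and exp by a maximum over `grid` equally spaced points."""
    pts = [T * i / (grid - 1) for i in range(grid)]
    return max(abs(taylor_poly(n, t) - math.exp(t)) for t in pts)

# Distances for the polynomials of degree 1 through 8.
errors = [sup_dist_to_exp(n) for n in range(1, 9)]
```

The grid maximum only approximates the supremum from below, so the numbers are indicative rather than exact.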


We have said that if two metric spaces $(X,d_1)$ and $(Y,d_2)$ are isometric to one
another, then they are, except for the names of the points in the underlying set,
the same metric space. Thus the following theorem should not come as a surprise.
We leave the proof as an exercise.


3.13.6 THEOREM. Let $(X,d_1)$ and $(Y,d_2)$ be two metric spaces that are
isometric to one another. Then $(X,d_1)$ is complete if and only if $(Y,d_2)$ is complete;
that is, completeness is preserved by isometries.


Although Theorem 3.13.6 is not a surprise, the next point may be somewhat
of a shock. Let $(X,d_1)$ and $(Y,d_2)$ be homeomorphic to one another. It is not true
that $(X,d_1)$ is complete if and only if $(Y,d_2)$ is complete. That is, homeomorphisms
do not necessarily preserve completeness. It follows, then, that completeness is
not a topological property. But, one may ask, how can this be? After all,
completeness has something to do with convergence, and homeomorphisms preserve
convergence (Section 7). True, but a homeomorphism does not necessarily
preserve Cauchy sequences!


EXAMPLE 10. Let X = (0,1] and $d_1$ be the usual metric, and let Y = [1,∞)
and $d_2$ be the usual metric. The function $f: (X,d_1) \to (Y,d_2)$, where y = f(x) = 1/x,
is a homeomorphism between $(X,d_1)$ and $(Y,d_2)$. The sequence {1/n} in $(X,d_1)$
is a Cauchy sequence. However, the corresponding sequence in $(Y,d_2)$, that is,
{f(1/n)} = {n}, is not a Cauchy sequence in $(Y,d_2)$. On the other hand, a sequence
$\{x_n\}$ in $(X,d_1)$ is convergent if and only if $\{f(x_n)\}$ is convergent in $(Y,d_2)$, for f
is a homeomorphism. Finally note that since Y is a closed subset of R the space
$(Y,d_2)$ is complete, whereas $(X,d_1)$ is not. ∎
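
The contrast in Example 10 can be seen numerically. The following sketch looks at a finite stretch of the sequence {1/n} (the cutoff of 2000 terms and the helper `tail_spread` are assumptions of this sketch): the tails of {1/n} have shrinking spread, while consecutive terms of the image sequence {n} always stay a distance 1 apart.

```python
def f(x):
    """The map f(x) = 1/x from (0, 1] onto [1, infinity)."""
    return 1.0 / x

xs = [1.0 / n for n in range(1, 2001)]   # the sequence {1/n} in X (finite stretch)
ys = [f(x) for x in xs]                  # its image {n} in Y

def tail_spread(seq, N):
    """Largest distance between two terms of seq with index at least N."""
    tail = seq[N:]
    return max(tail) - min(tail)

# Spread of the tails of {1/n}: these shrink, as a Cauchy sequence requires.
x_spreads = [tail_spread(xs, N) for N in (10, 100, 1000)]
# Gaps between consecutive image terms: these stay near 1, so {n} is not Cauchy.
y_gaps = [abs(ys[n + 1] - ys[n]) for n in range(len(ys) - 1)]
```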


Isometries, then, belong to the special class of homeomorphisms that preserve 
completeness; whereas, an arbitrary homeomorphism need not preserve complete- 
ness. On the other hand, it should not be surprising to find homeomorphisms, 
which are not isometries, that do preserve completeness. We can specify a very 
large class of such homeomorphisms. 


3.13.7 DEFINITION. A homeomorphism f is said to be a uniform
homeomorphism if f and $f^{-1}$ are uniformly continuous.


3.13.8 DEFINITION. Two metric spaces $(X,d_1)$ and $(Y,d_2)$ are said to be
uniformly homeomorphic (to one another) if there exists a uniform homeomorphism
mapping one of them onto the other.


3.13.9 THEOREM. Let $(X,d_1)$ and $(Y,d_2)$ be uniformly homeomorphic. Then
$(X,d_1)$ is complete if and only if $(Y,d_2)$ is complete.




The proof of this theorem is outlined in the exercises. 

Theorem 3.13.9 shows that uniform homeomorphisms preserve completeness.
However, it is not true that if f is a homeomorphism between two complete metric
spaces, then f is a uniform homeomorphism. For example, the real numbers with
the usual metric, (R,d), is a complete metric space and $y = x^3$ is a nonuniform
homeomorphism mapping (R,d) onto itself. Hence, uniform homeomorphisms are
not the only homeomorphisms that preserve completeness.
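
To see informally why the cubing map fails to be uniformly continuous, one can fix a small separation between two inputs and watch the separation of the outputs grow as the inputs move out along the real line. The sketch below does this; the value of `delta` and the sample points are arbitrary choices made for illustration.

```python
def cube(x):
    """The homeomorphism y = x^3 of the real line onto itself."""
    return x ** 3

def gap(x, delta):
    """Distance between the images of x and x + delta."""
    return abs(cube(x + delta) - cube(x))

delta = 1e-3                                       # fixed input separation
gaps = [gap(x, delta) for x in (1.0, 10.0, 100.0, 1000.0)]
```

Since the output gap is roughly 3·x²·delta, no single delta can keep it below a given tolerance for every x.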

There is another way to characterize complete metric spaces which is often 
useful. Before stating this characterization, we give a needed definition. 


3.13.10 DEFINITION. A sequence $\{A_n\}$ of subsets of a metric space (X,d) is
said to be a decreasing sequence of subsets if

$$A_1 \supset A_2 \supset A_3 \supset \cdots.$$


3.13.11 THEOREM. Let (X,d) be a metric space. Then the following statements
are equivalent:


(a) (X,d) is complete.
(b) $\bigcap_{n=1}^{\infty} A_n$ contains exactly one point, for every decreasing sequence $\{A_n\}$ of
nonempty closed subsets with $\mathrm{diam}(A_n) \to 0$ as $n \to \infty$.


Proof: (a) ⇒ (b). Suppose (X,d) is complete and let $\{A_n\}$ be a decreasing
sequence of nonempty closed subsets with $\mathrm{diam}(A_n) \to 0$ as $n \to \infty$. Let
$A = \bigcap_{n=1}^{\infty} A_n$.

If x and y are in A, then $d(x,y) \le \mathrm{diam}(A) \le \mathrm{diam}(A_n)$ for every n. That is,
d(x,y) = 0 or x = y. We must now show that A is not empty. Choose any sequence
$\{x_n\}$ with $x_n \in A_n$. If $m \ge n$, then $x_m \in A_n$ and $d(x_n,x_m) \le \mathrm{diam}(A_n) \to 0$ as $n \to \infty$.
Hence $\{x_n\}$ is a Cauchy sequence. Since (X,d) is complete, this sequence converges;
call the limit $x_0$. Since $A_n$ is closed, it follows from the Closed Set Theorem that
$x_0 \in A_n$ for every n. Therefore $x_0 \in A$, which shows that $A = \bigcap_{n=1}^{\infty} A_n$ contains
precisely one point.

(b) ⇒ (a). Now suppose that for each decreasing sequence $\{A_n\}$ of nonempty
closed sets with $\mathrm{diam}(A_n) \to 0$, the intersection $A = \bigcap_{n=1}^{\infty} A_n$ contains exactly one
point. We want to show that (X,d) is complete. We shall outline the argument and
ask the reader to check the details. Let $\{x_n\}$ be any Cauchy sequence in (X,d).
Let $B_n = \{x_n, x_{n+1}, \ldots\}$ and $A_n = \bar{B}_n$. Then $\{B_n\}$ is a decreasing sequence and
$\{A_n\}$ is decreasing by Exercise 9, Section 12. Also $\mathrm{diam}(A_n) = \mathrm{diam}(B_n)$ by
Exercise 8, Section 12. Since $\{x_n\}$ is a Cauchy sequence one has

$$\mathrm{diam}(A_n) = \mathrm{diam}(B_n) \to 0 \quad \text{as } n \to \infty.$$

Since $A = \bigcap_{n=1}^{\infty} A_n$ contains one point $x_0$, one has $x_n \to x_0$ as $n \to \infty$. Indeed,
$d(x_0,x_n) \le \mathrm{diam}(A_n) \to 0$ as $n \to \infty$. ∎
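
A familiar instance of statement (b) is the bisection method on the real line, where the closed intervals produced at each step form a decreasing sequence whose diameters tend to zero. The sketch below is an illustration under assumptions: the function x·x − 2, the starting interval [1, 2], and the name `bisect_intervals` are choices made here, not part of the text.

```python
def bisect_intervals(f, a, b, steps):
    """Return nested closed intervals (a_n, b_n) with f(a_n) <= 0 <= f(b_n),
    each one half as long as the one before it."""
    assert f(a) <= 0 <= f(b)
    intervals = [(a, b)]
    for _ in range(steps):
        mid = (a + b) / 2
        if f(mid) <= 0:
            a = mid            # keep the right half
        else:
            b = mid            # keep the left half
        intervals.append((a, b))
    return intervals

ivs = bisect_intervals(lambda x: x * x - 2, 1.0, 2.0, 40)
diams = [b - a for a, b in ivs]    # diameters of the nested intervals
```

In exact arithmetic the single point common to all the intervals is the square root of 2; the floating-point run only approximates it.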




EXERCISES


1. Carry out a development for pseudometric spaces analogous to the one carried
out in this section for metric spaces.

2. Show that a Cauchy sequence $\{x_n\}$ in a metric space (X,d) is convergent if
and only if it contains at least one convergent subsequence.

3. Let $(B,d_p)$ be the subspace of $(l_p,d_p)$ made up of all sequences with only a
finite number of nonzero entries. Is $(B,d_p)$ complete?

4. Prove Theorem 3.13.6.

5. Prove Theorem 3.13.9. [Hint: Let $f: X \to Y$ be a uniform homeomorphism
between $(X,d_1)$ and $(Y,d_2)$. Show that $\{y_n\}$ is a Cauchy sequence in $(Y,d_2)$
if and only if $\{f^{-1}(y_n)\}$ is a Cauchy sequence in $(X,d_1)$.]

6. Show that the space $(l_\infty,d_\infty)$ (see Example 5, Section 3) is complete.

7. Let $(X_1,d_1)$ and $(X_2,d_2)$ be complete metric spaces. Show that the product
space (X,d) is complete, where $X = X_1 \times X_2$ and $d(x,y) = d_1(x_1,y_1) + d_2(x_2,y_2)$.

8. Use Exercise 7 to show that the space $(R^n,d_p)$, $1 \le p < \infty$ (see Example 2,
Section 3) is complete. Also show $(C^n,d_p)$ is complete.

9. (Extension of Example 7.) Let $(X,d_1)$ and $(Y,d_2)$ be two metric spaces.
Assume that $(Y,d_2)$ is complete. Let Z = BC(X,Y) denote the space of
bounded continuous functions from X into Y. Assume that Z has the metric

$$\sigma(f,g) = \sup\{d_2(f(x), g(x)) : x \in X\}.$$

Show that $(Z,\sigma)$ is complete. [Note: $(X,d_1)$ need not be complete.]

10. Show that the metric spaces in Examples 16 and 17 of Section 3 are complete.

11. Let I be an interval in R. Use the fact that R (with the usual metric) is complete
to show that I is connected. Also prove the converse, namely, if A is a
connected set in R, then A is an interval.

12. Let $f: [a,b] \to R$ be continuous with f(a) < 0 and f(b) > 0. Show that there is
an x, a < x < b, with f(x) = 0. [Hint: Use Exercise 11 above and Exercise 6,
Section 10.]

13. Let $f_1(x)$ and $f_2(x)$ be strictly monotone continuous functions defined for
$0 \le x \le 1$ with $f_1(0) = f_2(0) = 0$. Show that there is a unique solution
$(r_0, x_1, x_2)$ for the equations

$$r_0 = f_1(x_1), \qquad r_0 = f_2(x_2), \qquad x_1 + x_2 = 1. \tag{3.13.3}$$

[Hint: Let r vary and let $x_1(r)$ and $x_2(r)$ be the solution of (3.13.3). Now apply
Exercise 12 to

$$P(r) = x_1(r) + x_2(r) - 1.$$

This number $r_0$ arises as a critical feeding rate in a biological problem, see
Sell and Weinberger [1].]
14. (Continuation of Exercise 9, Section 5.) Show that the space LC,,,* is complete. 


15. Let $g: X \to Y$ be uniformly continuous, where X and Y are metric spaces.
Show that if $\{x_n\}$ is a Cauchy sequence in X, then $\{g(x_n)\}$ is a Cauchy sequence
in Y.

16. Let $\{x_n\}$ and $\{y_n\}$ be two Cauchy sequences in a metric space (X,d). Show
that $\{d(x_n,y_n)\}$ is a Cauchy sequence in R, where R has the usual metric.

17. (Baire Theorem.) Let (X,d) be a complete metric space where $X = \bigcup_{n=1}^{\infty} A_n$ and
the sets $A_n$ are closed. Show that at least one of the sets $A_n$ contains a nonempty
open local neighborhood. [Hint: Argue by contradiction and construct a
decreasing sequence of open local neighborhoods $B_{1/2^n}(x_n)$ such that
$B_{1/2^n}(x_n) \cap A_n = \emptyset$. Show that $\{x_n\}$ converges and that $x = \lim x_n$ is not in
$\bigcup_{n=1}^{\infty} A_n$.]

18. Show that the space C(−∞,∞) is complete with the metric

$$d(x,y) = \sum_{n=1}^{\infty} \frac{1}{2^n} \, \frac{\sup\{|x(t) - y(t)| : |t| \le n\}}{1 + \sup\{|x(t) - y(t)| : |t| \le n\}}.$$

What happens if we replace this metric with

$$d'(x,y) = \sum_{n=1}^{\infty} 2^{-n} \min\big(1, \sup\{|x(t) - y(t)| : |t| \le n\}\big)?$$

19. Let $\{x_n\}$ be a sequence of real- or complex-valued continuous functions
defined on an interval I, and assume that $\{x_n\}$ converges uniformly to a limit
x, that is, for every $\varepsilon > 0$ there is an N such that $|x_n(t) - x(t)| \le \varepsilon$ for all $n \ge N$
and all t in I. Show that $x(\cdot)$ is a continuous function. [Hint: Note that

$$|x(t) - x(s)| \le |x(t) - x_n(t)| + |x_n(t) - x_n(s)| + |x_n(s) - x(s)|.]$$

14. COMPLETION OF METRIC SPACES 


This section is devoted to explaining the following assertion: Every metric 
space has a unique completion. 

Let us consider a simple case first. Suppose that (X,d) is a complete metric
space and that (Y,d) is an arbitrary subspace of (X,d). As has been noted, (Y,d)
is complete if and only if Y is a closed set in (X,d). In any event, the closure $\bar{Y}$ is
a closed set in (X,d), and $(\bar{Y},d)$ is complete. Moreover, Y is dense in $(\bar{Y},d)$. Thus,
in going from (Y,d) to $(\bar{Y},d)$, we fill in any "holes" that may exist in (Y,d). For
example, let (X,d) be the real numbers with the usual metric, and let


Y = {r : r is rational and 0 < r < 1}.


Then (Y,d) is not complete. However, the closure $\bar{Y} = [0,1]$ is complete, and Y is
dense in $(\bar{Y},d)$.

Obviously the foregoing is a way to "complete" a metric space that is a
subspace of a complete metric space. Many times, however, we would like to
"complete" a metric space that is not specified as being a subspace of some
complete metric space. A generally applicable concept of completion is given in
the next definition.


3.14.1 DEFINITION. Let $(X,d_1)$ be a metric space. A metric space $(Y,d_2)$
is said to be a completion of $(X,d_1)$ if


(1) $(Y,d_2)$ is complete, and
(2) $(X,d_1)$ is isometric with a dense subspace $(Z,d_2)$ of $(Y,d_2)$.


The situation is illustrated in Figure 3.14.1.


Figure 3.14.1. S is an isometry of $(X,d_1)$ onto Z, and Z is dense in $(Y,d_2)$.

Perhaps the first thing to notice about this concept is that a completion of
$(X,d_1)$ does not necessarily contain $(X,d_1)$. This may seem bad, but do note that
the completion does contain a dense subspace $(Z,d_2)$ that is isometric to $(X,d_1)$.
We can think of $(Z,d_2)$ as a mere renaming of the points of $(X,d_1)$. Moreover, in
certain cases $(X,d_1)$ and $(Z,d_2)$ are the same. This is the case in the example with
which we started this section.

There are two questions which can be posed. First, which metric spaces have
a completion? Second, how many "different" completions does a given space
have? The answers are: (1) every metric space (X,d) has a completion and (2) all
the completions of (X,d) are isometric with one another, that is, the completion
is essentially unique.


3.14.2 THEOREM. Let (X,d) be a metric space. Then (X,d) has a completion.
Moreover, if $(Y_1,d_1)$ and $(Y_2,d_2)$ are two completions of (X,d), then $(Y_1,d_1)$ and
$(Y_2,d_2)$ are isometric.


The traditional method of proving this is outlined in the exercises. The reader 
should also note that the existence of a completion is a direct consequence of some 
earlier exercises, in particular, Exercises 8 and 9 of Section 5, Exercise 12, Section 
10, Exercise 14, Section 13, and Theorem 3.13.5. We ask the reader to verify this. 
The fact that two completions are isometrically equivalent is a direct consequence 
of a more general result, which we now state. 




3.14.3 THEOREM. Let $(\hat{X},\hat{d})$ be a completion of a metric space (X,d) and let
$g: (X,d) \to (Y,\sigma)$ be a uniformly continuous function, where $(Y,\sigma)$ is a complete
metric space. Then there is a unique continuous "extension" $\hat{g}: (\hat{X},\hat{d}) \to (Y,\sigma)$ of g.
(See Figure 3.14.2.)


Figure 3.14.2.


We shall give the proof shortly. However, the term "extension" needs a word
of explanation. It is customary to view the completion of a space as a process of
adding ideal points. That is, if $S: X \to \hat{X}$ is an isometry that imbeds X into its
completion $\hat{X}$, then instead of viewing X and S(X) as different spaces one considers
them to be the same. In this way the points in $\hat{X} - S(X)$ are "ideal" points and
we complete S(X) by "adding" them to S(X).

This process is not as strange as it may sound. In fact, the generalized functions, 
such as the Dirac function, arising in the operational calculus, are examples of 
ideal points which one adds to a space of (ordinary) functions, as we now shall see. 


EXAMPLE 1.⁷ Let X denote the collection of all functions x in $L_1(-1,1)$
with the property that $\int_{-1}^{1} |x(t)| \, dt = 1$. We define a metric on X as follows: Let
$\{P_n\}$, n = 1, 2, ..., denote the countable collection of all polynomials in t with
rational coefficients. (Recall that $\{P_n\}$ is a dense set in C[−1,1] with the
sup-metric, see Exercise 22, Section 12.) Now for $x, y \in X$ let

$$\rho_n(x,y) = \min\Big(1, \Big|\int_{-1}^{1} [x - y] P_n \, dt\Big|\Big)$$

and

$$\rho(x,y) = \sum_{n=1}^{\infty} 2^{-n} \rho_n(x,y).$$

We claim that $\rho$ is a metric on X. [It is easy to show that $\rho$ is a pseudometric on
X, and we define a new equality on X by saying x = y if and only if $\rho(x,y) = 0$.]


7 There are many (equivalent) ways of defining the Dirac function. The most common method is
to use the theory of distributions, see A. Friedman [1]. Our primary purpose in this example is 
simply to show that the Dirac function can be viewed as an ideal point arising in the completion 
of a metric space. 




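Now consider the sequence

$$f_m(t) = \begin{cases} m, & |t| \le \dfrac{1}{2m}, \\ 0, & \text{otherwise} \end{cases}$$

for m = 1, 2, .... We claim that $\{f_m\}$ is a Cauchy sequence in this metric. To prove
this we fix $\varepsilon > 0$. Now choose N so that

$$\sum_{n=N+1}^{\infty} 2^{-n} \le \frac{\varepsilon}{2}.$$

Now consider the finite set of polynomials $\{P_1, P_2, \ldots, P_N\}$. Then for $n \ge m$ one
has (see Figure 3.14.3)

$$\int_{-1}^{1} [f_n - f_m] P_i \, dt = -m \Big[\int_{-1/2m}^{-1/2n} + \int_{1/2n}^{1/2m}\Big] P_i \, dt + (n - m) \int_{-1/2n}^{1/2n} P_i \, dt.$$

Figure 3.14.3. Graphs of $f_n$, $f_m$, and $P_i$, with the points $\pm 1/2m$ and $\pm 1/2n$ marked on the t-axis.

By the Mean Value Theorem for integrals one has points $t_1$, $t_2$, $t_3$ in the intervals
[−1/2m, −1/2n], [−1/2n, 1/2n], [1/2n, 1/2m], respectively, such that

$$\int_{-1}^{1} [f_n - f_m] P_i \, dt = -m \Big(\frac{1}{2m} - \frac{1}{2n}\Big) [P_i(t_1) + P_i(t_3)] + \frac{n - m}{n} P_i(t_2)$$

$$= \frac{n - m}{2n} \, [2P_i(t_2) - P_i(t_1) - P_i(t_3)]$$

$$\le \tfrac{1}{2} \, |2P_i(t_2) - P_i(t_1) - P_i(t_3)|.$$

Since $P_i$ is continuous, we can choose $M_i$ so that

$$\tfrac{1}{2} \, |2P_i(t_2) - P_i(t_1) - P_i(t_3)| \le \frac{\varepsilon}{2N}$$

whenever $m \ge M_i$. (Why?) It follows then that $\rho_i(f_n,f_m) \le \varepsilon/2N$ whenever
$n \ge m \ge M_i$. Hence $\rho(f_n,f_m) \le \varepsilon$ whenever $n \ge m \ge M$ where $M = \max(M_1, \ldots, M_N)$.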




Thus $\{f_m\}$ is a Cauchy sequence, but the limit is not in X. By taking the
completion of X we see that the limit of $\{f_m\}$ is the Dirac function $\delta_0$. ($\delta_0$ can be viewed
as an operator on C[−1,1] where $\delta_0(f) = f(0)$ for $f \in C[-1,1]$. Any function x
in X can also be viewed in this way. Namely x is a mapping of C[−1,1] into R
given by $x: f \to \int_{-1}^{1} x(t) f(t) \, dt$. See Section 5.11 or Taylor [2, pp. 33-35].) ∎
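
The operator point of view can be tried out numerically. The sketch below pairs the box functions of this example (height m on |t| ≤ 1/(2m), zero elsewhere) with a continuous test function and watches the values approach the value of the test function at 0. The choice of cosine as the test function, the midpoint rule, and the name `pair_with_f_m` are assumptions made for this illustration.

```python
import math

def pair_with_f_m(phi, m, cells=2000):
    """Midpoint-rule value of the integral over [-1, 1] of f_m(t) * phi(t),
    where f_m equals m for |t| <= 1/(2m) and 0 elsewhere."""
    half = 1.0 / (2 * m)                 # f_m vanishes outside [-half, half]
    h = 2 * half / cells                 # width of one quadrature cell
    total = sum(phi(-half + (j + 0.5) * h) for j in range(cells))
    return m * total * h

# Pairings with phi = cos; the values should drift toward cos(0) = 1.
vals = [pair_with_f_m(math.cos, m) for m in (1, 2, 4, 8, 16, 32)]
```

For this test function the exact pairing is 2m·sin(1/(2m)), which gives a check on the quadrature.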


Proof of Theorem 3.14.3: Since $g: X \to Y$ is uniformly continuous, it follows
that g preserves Cauchy sequences. (See Exercise 15, Section 13.) Therefore, if
$\{x_n\}$ is a Cauchy sequence in X, then $\{g(x_n)\}$ is a Cauchy sequence in Y. Since Y
is complete, there is a point y in Y with $\lim g(x_n) = y$. If $x_n \to \hat{x}$ where $\hat{x} \in \hat{X}$, we
then define $\hat{g}(\hat{x}) = y$. We must show that $\hat{g}(\hat{x})$ depends only on $\hat{x}$ and not on the
sequence $\{x_n\}$. This follows from the uniform continuity of g. Let $\delta(\varepsilon)$ be the
modulus of continuity of g. Let $\{x_n\}$ and $\{x_n'\}$ be two sequences with $\lim x_n =
\lim x_n' = \hat{x}$, and define y and y' by $\lim g(x_n) = y$ and $\lim g(x_n') = y'$. Since

$$d(x_n, x_n') = \hat{d}(x_n, x_n') \le \hat{d}(x_n, \hat{x}) + \hat{d}(\hat{x}, x_n') \to 0,$$

we can find an N so that $d(x_n,x_n') \le \delta(\varepsilon)$ whenever $n \ge N$. It follows that
$\sigma(g(x_n), g(x_n')) \le \varepsilon$. Hence $\sigma(y,y') \le \varepsilon$. (See Exercise 4, Section 7.) Since $\varepsilon$ is
arbitrary we have $\sigma(y,y') = 0$, or y = y'.
It is easy to show that the extension $\hat{g}$ is continuous; as a matter of fact it is
uniformly continuous. We omit these details. ∎


EXAMPLE 2. Let X be the space C[0,T] with the metric

$$d(x,y) = \Big[\int_0^T |x(t) - y(t)|^2 \, dt\Big]^{1/2}.$$

Then the completion of X is $L_2[0,T]$. Define a mapping $K: X \to L_2[0,T]$ by
y = Kx where

$$y(t) = \int_0^T k(t,s) x(s) \, ds \tag{3.14.1}$$

and

$$\int_0^T \!\! \int_0^T |k(t,s)|^2 \, dt \, ds < \infty.$$

It is not difficult to show that K is uniformly continuous. (See Examples 4 and 5 of
Section 5.) It follows from the last theorem that K has a unique extension to all of
$L_2[0,T]$. The representation of this extension is given by (3.14.1) where the integral⁸
is now the Lebesgue integral. ∎


EXERCISES 


1. The following is another proof of Theorem 3.14.2. Let Y denote the collection
of all Cauchy sequences $\{x_n\}$ from a metric space (X,d).
(a) Show that $\sigma(\{x_n\},\{x_n'\}) = \lim d(x_n,x_n')$ exists for $\{x_n\}$ and $\{x_n'\}$ in Y.


8 One can view the operator K as being defined in terms of the Riemann integral. The extension 
of K would then require the Lebesgue integral. 




(b) Define a "new" equality on Y by saying that $\{x_n\} = \{x_n'\}$ if
$\lim d(x_n,x_n') = 0$.
Show that $\sigma$ is a metric on Y, in terms of this new equality.
(c) Show that $(Y,\sigma)$ is complete.
(d) Find an isometry of X into Y. [Hint: Set $x_n = x$.]


2. Complete the proof of Theorem 3.14.2 as outlined in the text.
3. Let X denote the collection of all functions f(z) analytic for |z| < 2. Define a
metric on X by

$$d(f,g) = \Big[\iint_{|z| < 1} |f(z) - g(z)|^2 \, dx \, dy\Big]^{1/2}.$$

Show that (X,d) is not complete. What is the completion of (X,d)?
4. Theorem 3.14.3 can be used for defining functions. For example, find a con- 
tinuous function g(t), 0 < t < 1, not identically zero such that 


$$g(t + s) = g(t)g(s).$$

(a) Show that g(0) = 1.
(b) Let g(1) = a and show that $g(r) = a^r$ for every rational r.
(c) Show that $g(t) = a^t$.
5. Find a continuous function h(t), $1 \le t$, that satisfies

$$h(ts) = h(t) + h(s).$$

(a) Show that h(1) = 0.
(b) Show that $h(t^r) = r h(t)$ for every rational r.
(c) Assume that h is not identically zero and show that h(t) > 0 for t > 1.
(d) Show that $h(t) = \log_b t$ for some b > 1.

6. Let X denote the space of all sequences of real numbers $x = \{x_1, x_2, \ldots\}$ with
the property that only a finite number of the coordinates $x_n$ are nonzero. Let
d be the sup-metric on X. Show that (X,d) is not complete. Describe the
completion of (X,d).

7. Discuss the completion of a pseudometric space. Show how Exercise 1 can be
simplified in this case. What is the analogue of Theorem 3.14.3?


15. CONTRACTION MAPPINGS


Given the concept of completeness introduced in the last two sections, we 
have the background for a study of contraction mappings. These mappings are 
extremely important, and they arise in a great number of applications. 


3.15.1 DEFINITION. Let (X,d) be a metric space and $f: X \to X$. We say that
f is a contraction, or a contraction mapping, if there is a real number k, $0 \le k < 1$,
such that

$$d(f(x), f(y)) \le k \, d(x,y)$$

for all x and y in X.




It follows immediately from the definition that a contraction mapping is 
uniformly continuous. The term k is sometimes called a Lipschitz coefficient for f. 
The reason contractions are important lies in the following fixed point theorem. 


3.15.2 THEOREM. (CONTRACTION MAPPING THEOREM.) Let (X,d) be a
complete metric space and let $f: X \to X$ be a contraction. Then there is one and only one
point $x_0$ in X such that

$$f(x_0) = x_0.$$

Moreover, if x is any point in X and $x_n$ is defined inductively by $x_1 = f(x)$,
$x_2 = f(x_1)$, ..., $x_n = f(x_{n-1})$, then $x_n \to x_0$ as $n \to \infty$. (That is, f has a unique fixed point
$x_0$ and every sequence of iterations of f converges to this fixed point.)


Proof: Let us first show that every sequence of iterations of f converges to a
fixed point. Next we show that f can have only one fixed point.
Let x be any point in X and define $x_1 = f(x)$, $x_2 = f(x_1)$, and in general,
$x_n = f(x_{n-1})$. Thus $x_n = f^n(x)$. We will now show that the sequence $\{x_n\}$ is a
Cauchy sequence. Assume that n > m, then

$$d(x_n, x_m) = d(f^n(x), f^m(x)) = d(f^m(x_{n-m}), f^m(x)) \le k \, d(f^{m-1}(x_{n-m}), f^{m-1}(x)).$$

By induction, we get

$$d(x_n, x_m) \le k^m \, d(x_{n-m}, x). \tag{3.15.1}$$

Using the triangle inequality, this becomes

$$d(x_n, x_m) \le k^m [d(x_{n-m}, x_{n-m-1}) + \cdots + d(x_2, x_1) + d(x_1, x)].$$

By applying (3.15.1) we get

$$d(x_n, x_m) \le k^m [k^{n-m-1} + \cdots + k + 1] \, d(x_1, x).$$

Since $0 \le k < 1$, we get

$$d(x_n, x_m) \le k^m \sum_{i=0}^{\infty} k^i \, d(x_1, x) = \frac{k^m}{1 - k} \, d(x_1, x). \tag{3.15.2}$$

The right side of (3.15.2) can be made arbitrarily small by choosing m (and n)
sufficiently large. That is, $d(x_n,x_m) \to 0$ as $n, m \to \infty$. Hence $\{x_n\}$ is a Cauchy
sequence.
Since the space (X,d) is complete, $\{x_n\}$ converges. Let $x_0 = \lim x_n$. We now
assert that $x_0$ is a fixed point of f. Since f is continuous, we know that

$$\lim_{n\to\infty} f(x_n) = f\big(\lim_{n\to\infty} x_n\big).$$

However, $f(\lim x_n) = f(x_0)$ and $\lim f(x_n) = \lim x_{n+1} = x_0$, so $x_0$ is a fixed point.

To show that the fixed point $x_0$ is unique we argue by contradiction. Assume
that $x_0$ and $y_0$ are two distinct fixed points of f. We then get the contradiction

$$0 < d(x_0, y_0) = d(f(x_0), f(y_0)) \le k \, d(x_0, y_0) < d(x_0, y_0).$$

Hence f has only one fixed point. ∎
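
The iteration in the theorem and the estimate (3.15.2) can be tried on a concrete contraction. The sketch below is an illustration under assumptions: it uses the cosine function on the closed interval [0,1] (which maps that interval into itself and has Lipschitz coefficient at most sin 1 there), starts from x = 0, and treats the 200th iterate as a numerical stand-in for the fixed point. The names `iterate` and `a_priori_bound` are introduced here for illustration.

```python
import math

def iterate(f, x, steps):
    """Return [x_1, ..., x_steps] with x_1 = f(x) and x_n = f(x_{n-1})."""
    out = []
    for _ in range(steps):
        x = f(x)
        out.append(x)
    return out

k = math.sin(1.0)          # bound on |cos'(x)| = |sin(x)| for x in [0, 1]
x = 0.0                    # assumed starting point
xs = iterate(math.cos, x, 200)
x0 = xs[-1]                # numerical stand-in for the fixed point
d1 = abs(xs[0] - x)        # the distance d(x_1, x)

def a_priori_bound(m):
    """Right side of (3.15.2), which also bounds d(x_m, x_0) on letting n grow."""
    return k ** m / (1 - k) * d1
```

Comparing `abs(xs[m - 1] - x0)` with `a_priori_bound(m)` for several m shows that the bound holds with room to spare in this instance.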


The fixed point theorem has several extensions. Let us consider one of the 
important ones. 


3.15.3 COROLLARY. Let (X,d) be a complete metric space and let f be a (not
necessarily continuous) function, $f: X \to X$. If for some integer p > 0 the function
$f^p$ is a contraction, then f has a unique fixed point.


Proof: Let $g = f^p$. By Theorem 3.15.2, g has a unique fixed point $x_0$. Let us
show that $x_0$ is also a fixed point of f. (Notice that a fixed point of f is also a fixed
point of g, so f can have at most one fixed point.)

Since $g = f^p$, it follows that f(g(x)) = g(f(x)) for all x in X. Since g is a
contraction, there is a k, $0 \le k < 1$, such that

$$d(g(x), g(y)) \le k \, d(x,y)$$

for all x and y in X. If $f(x_0) \ne x_0$ we get the contradiction:

$$0 < d(f(x_0), x_0) = d(f(g(x_0)), g(x_0)) = d(g(f(x_0)), g(x_0)) \le k \, d(f(x_0), x_0) < d(f(x_0), x_0). \;∎$$


It is possible for f to be discontinuous while the composition $f \circ f$ is a
contraction. For example, let X = [0,1] and define


, for 0<x<3 
, for #<x<l. 


f(x) = 


Then fo f(x) = } for allO <x <1, so fo fis a contraction. 
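
A concrete function of this kind can be checked by machine. In the sketch below the two values taken by f (1/4 on the left half of the interval and 1/2 on the right half) are illustrative assumptions chosen for this sketch, not values recovered from the text. With these values f has a jump at 1/2, the composition of f with itself is constant, and the constant is the unique fixed point of f, as Corollary 3.15.3 predicts.

```python
from fractions import Fraction

def f(x):
    """A discontinuous map of [0, 1] into itself (assumed illustrative values)."""
    return Fraction(1, 4) if x <= Fraction(1, 2) else Fraction(1, 2)

def f2(x):
    """The composition of f with itself."""
    return f(f(x))

# Sample points 0, 1/100, ..., 1 in exact rational arithmetic.
samples = [Fraction(i, 100) for i in range(101)]
values_of_f2 = {f2(x) for x in samples}
```

A constant map is a contraction with k = 0, so the corollary applies with p = 2.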
Let us now consider some examples of the use of contraction mappings. 


EXAMPLE 1 (NONLINEAR FILTER). Consider the nonlinear⁹ filter shown in
Figure 3.15.1. ($\lambda$ can be thought of as a "gain factor" of the linear element.)
Assume that the initial conditions for the linear filter are 0. Then the (nonlinear)
integral equation relating the input u(t) to the output z(t) is

$$z(t) = \lambda \int_0^t K(t,s) F(u(s),s) \, ds, \qquad 0 \le t \le T < \infty. \tag{3.15.3}$$

We assume that K(t,s) and F(u,t) are continuous and that F(0,t) = 0 for all t.
Further, assume that K and F are bounded, that is, $|K(t,s)| \le M$ for¹⁰ $0 \le t, s \le T$


9 Strictly speaking we are tacitly assuming an algebraic structure for discussing linearity or
nonlinearity. This does not matter, since the precise assumptions of F and K are given below.
10 This assumption for K is redundant, see Theorem 3.17.21.


Figure 3.15.1. Nonlinear element: $y(t) = F(u(t),t)$. Time-varying linear element: $z(t) = \lambda \int_0^t K(t,s) y(s) \, ds$.


and |F(u,t)|< N for —o <u<oo and 0<t<T-. Finally, let us assume that 
F satisfies a global Lipschitz condition, that is, there is an L > O such that 


[F(u,t) — Fv,t)| < Llu — o| 


for all u, v, and ¢. 

Let us denote the mapping (3.15.3) of u into z by z= f(u). Then with these 
assumptions, f: C[0,7]— C[0,T], that is, if u is a continuous function on [0,7], 
then z = f(u) is a continuous function on [0,7]. 

Let us now show that the mapping f: C—-> C is continuous. Of course, to do 
this we must put a metric on C; we take the sup-metric d. Now 


FMM — FO! = 


j ‘K(t,s){F(u(s),8) — F(o(s),s)} ds 
< 14 [ 1K(4)1- [FQu(s),8) — FC0(),3)| ds 


< |A| ML i |u(s) — v(s)| ds (3.15.4) 
0 


or 
IF(u)(t) — FO) S [Al ML d(u,v)t. (3.15.5) 
This implies that 
d(f(u), f(v)) < |A| MLT d(u,v). 


The last inequality shows that if |A] < (MLT)7', then fis a contraction. Thus 
when J is sufficiently small, we see that fhas a unique fixed point. Since F(0,t) = 0 
for all t, it follows that u(t) = 0 is the fixed point. 

We can actually show that for all A, u(t) = 0 is the only fixed point of f. We 
do this by applying the last corollary and show that for each A there is a positive 
integer p such that /? is a contraction. 

First we assert that 


Lfr(uy(t) — fr(uy(a)| < 4 ee ul 


d(u,v). (3.15.6) 


We prove this by induction. For p = 1, (3.15.6) reduces to (3.15.5). Now assume 
(3.15.6) is true for p and let us check p + 1. By (3.15.4) we get 


PPP HNC) = FP* MONDIAL ML J 1 °CWN(s) — FOND ds 


3.15. CONTRACTION MAPPINGS 129 
By the induction hypothesis, this becomes 
t(|A| MLs)? 
Pra — f° (ONCOL s lal mfp SE as} acu) 
0 ‘ 


By integrating we then get (3.15.6) for p + 1. 
Now (3.15.6) implies that 


(u,v). 


d(f?(u), f%(v)) < a d 


With A given, we can find an integer p such that 
(Al MELT)" 1 
p! 
Then, for this p, f? is a contraction. Thus for every A, f has a unique fixed point, 
namely u(t) = 0. 

This is a very interesting result, for it tells us that a fairly general class of
filters has only the null function 0 as a fixed point. In other words, every nontrivial
input is distorted. Moreover, one can modify this argument to show that the
equation f(u) = αu, where α is a nonzero constant, has only u = 0 as a solution.
In other words, the mapping f has no eigenvectors. ∎
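The choice of p in Example 1 can be made concrete. The following is a minimal sketch, not part of the original text: the function name contraction_power and the sample values are hypothetical. It finds the smallest p for which (|λ|MLT)^p / p! drops below 1, building the term by the recursion term_p = term_{p-1} · c / p.

```python
def contraction_power(lam, M, L, T):
    """Smallest integer p >= 1 with (|lam|*M*L*T)**p / p! < 1.

    lam, M, L, T play the roles of lambda, the bound on K, the Lipschitz
    constant of F, and the interval length in Example 1 (illustrative only).
    """
    c = abs(lam) * M * L * T
    p = 1
    term = c                     # term == c**p / p! for the current p
    while term >= 1.0:
        p += 1
        term *= c / p            # c**p / p! = (c**(p-1) / (p-1)!) * c / p
    return p
```

For example, contraction_power(3.0, 1.0, 1.0, 1.0) walks through the terms 3, 4.5, 4.5, 3.375, 2.025, 1.0125 and stops at p = 7. The loop always terminates because c^p / p! tends to 0 for every fixed c.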


EXAMPLE 2. (EXISTENCE AND UNIQUENESS THEOREM FOR SOLUTIONS OF ORDINARY
DIFFERENTIAL EQUATIONS.) Let us consider the ordinary differential equation

\[ \frac{dy}{dt} = f(y,t), \qquad (3.15.7) \]

where f is a real-valued, continuous function defined on R^1 × R^1. We shall seek
a solution y(t) for (3.15.7) which satisfies the initial condition

\[ y(t_0) = y_0. \]


This is said to be the initial value problem. Since f is continuous, it is easily checked
that a solution of the initial value problem is equivalent to a solution of the integral
equation

\[ y(t) = y_0 + \int_{t_0}^{t} f(y(s),s)\, ds. \qquad (3.15.8) \]

Let us ask: When does (3.15.8) have a unique solution? In order to formulate
an answer, let us consider the operator z = F(y) where

\[ z(t) = y_0 + \int_{t_0}^{t} f(y(s),s)\, ds. \]

F is then a mapping of one space of functions into another. By elementary calculus,
we see that if y is continuous, then z is continuous. Thus F: 𝒞 → 𝒞, where 𝒞 denotes
the space of (real-valued) continuous functions defined on some interval I containing t_0.




Now y(t) is a solution of (3.15.8) if and only if y = F(y); that is, if and only
if y is a fixed point of F. We now ask, under what conditions (on f) is the mapping
F a contraction?

(In order to simplify the notation, let us set y_0 = t_0 = 0.)

Since f is continuous, it is bounded on the set −1 ≤ y ≤ 1, −1 ≤ t ≤ 1; that is,

\[ |f(y,t)| \le M \]

on this set. Now assume that f satisfies a Lipschitz condition in y. In other words,
assume there is a constant K such that

\[ |f(y,t) - f(x,t)| \le K|y - x| \]

for all¹¹ t, x, y satisfying −1 ≤ t ≤ 1, −1 ≤ x ≤ 1, −1 ≤ y ≤ 1.
With M and K defined above, let X denote the set of all continuous real-valued
functions φ(t) satisfying

\[ |\phi(t)| \le M|t| \]

on the interval I = [−T,T] where 0 < T ≤ 1, MT ≤ 1, and KT < 1. (See Figure
3.15.2.) X is then a subset of 𝒞(I). Moreover, if 𝒞(I) has the sup-metric d_∞, then X
is a closed subset. To prove this, we want to show that the complement X′ is open.
So let φ be in X′. Then for some t in I, |φ(t)| > M|t|. Let 2ε = |φ(t)| − M|t| > 0.
Now if ψ is in B_ε(φ), then |ψ(t)| − M|t| > ε. Hence ψ is in X′, so X′ is open.


[Figure 3.15.2. A function in X.]


Since (𝒞,d_∞) is complete and X is a closed subspace, it follows that (X,d_∞)
is complete by Theorem 3.13.5. Let us now show that F maps X into X, and that
F is a contraction.


11 The expert will realize that we are being needlessly restrictive here. However, our purpose is 
not to prove the most general existence and uniqueness theorem, but rather to indicate an applica- 
tion of the Contraction Mapping Theorem. 


Let φ ∈ X. Then F(φ) is in X since

\[ |F(\phi)(t)| \le \int_0^{|t|} |f(\phi(s),s)|\, ds \le \int_0^{|t|} M\, ds \le M|t| \]

for all t in I. (For this we have only used the fact that f is continuous. The Lipschitz
condition will be used to show that F is a contraction.)
Let x and y be in X. Then

\[ F(x)(t) - F(y)(t) = \int_0^t [f(x(s),s) - f(y(s),s)]\, ds. \]

Thus for t ≥ 0 one gets

\[ |F(x)(t) - F(y)(t)| \le \int_0^t K|x(s) - y(s)|\, ds \le \int_0^t K\, d_\infty(x,y)\, ds \le Kt\, d_\infty(x,y). \]

For t ≤ 0 one gets |F(x)(t) − F(y)(t)| ≤ K|t| d_∞(x,y). For |t| ≤ T we get

\[ d_\infty(F(x), F(y)) \le KT\, d_\infty(x,y). \]

Since KT < 1, F is a contraction. Therefore, F has a unique fixed point and the
initial value problem has a unique solution on the interval [−T,T]. ∎
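The fixed point of F in Example 2 can be approximated by iterating F from the zero function (Picard iteration). The sketch below is illustrative and rests on stated assumptions: functions on [−T,T] are sampled on a uniform grid, the integral is replaced by the trapezoid rule, and the names picard_iterate, half_steps and n_iter are hypothetical. It records the sup-distance between successive iterates so the contraction factor KT can be observed.

```python
import math

def picard_iterate(f, T=1.0, half_steps=500, n_iter=30):
    """Iterate z(t) = integral from 0 to t of f(y(s), s) ds, starting from y = 0.

    Assumes y_0 = t_0 = 0 as in the text. Returns the grid, the last
    iterate, and the sup-distances between successive iterates.
    """
    n = 2 * half_steps + 1
    h = T / half_steps
    t = [(i - half_steps) * h for i in range(n)]    # t[half_steps] == 0
    y = [0.0] * n
    gaps = []
    for _ in range(n_iter):
        g = [f(y[i], t[i]) for i in range(n)]
        z = [0.0] * n
        for i in range(half_steps + 1, n):           # trapezoid rule, forward from 0
            z[i] = z[i - 1] + 0.5 * h * (g[i - 1] + g[i])
        for i in range(half_steps - 1, -1, -1):      # trapezoid rule, backward from 0
            z[i] = z[i + 1] - 0.5 * h * (g[i + 1] + g[i])
        gaps.append(max(abs(a - b) for a, b in zip(z, y)))
        y = z
    return t, y, gaps
```

As a usage example, take f(y,t) = (y + 1)/2, for which M = 1 and K = 1/2 on the square, so T = 1 satisfies MT ≤ 1 and KT < 1. The exact solution is y(t) = e^{t/2} − 1, and picard_iterate(lambda y, t: 0.5 * (y + 1.0)) should reproduce it up to the discretization error of the grid.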


EXAMPLE 3. (CLOSED LOOP FEEDBACK SYSTEM.) Consider the closed loop
feedback system illustrated in Figure 3.15.3. The equation for this system is
determined as follows: Let r, e, and c be real-valued functions of t and F a nonlinear
operator. Then e = r − c and e + F(e) = (I + F)e = r.


Figure 3.15.3. 


Assume that r, e, and c are points in the complete metric space BC, where
BC is made up of all bounded continuous real-valued functions defined on R,
with the sup-metric d_∞. Further, let F be a contraction mapping defined on BC.
We would like to know several things about the mapping (I + F):


(1) Is it one-to-one? 

(2) Is the range BC? 

(3) Is it invertible? 

(4) If it is invertible, is the inverse continuous? 




Using the fact that F is a contraction mapping, it is a simple matter to show that
the answer to each of the above questions is yes.

Assume that I + F is not one-to-one. Then there exist e_1 and e_2 such that
d_∞(e_1,e_2) ≠ 0 and

\[ (I + F)e_1 = (I + F)e_2 \]

or

\[ e_1 - e_2 = Fe_2 - Fe_1. \qquad (3.15.9) \]

However, (3.15.9) and the fact that F is a contraction imply that

\[ 0 < d_\infty(e_1,e_2) = d_\infty(Fe_1,Fe_2) < d_\infty(e_1,e_2), \]

which is a contradiction. Hence I + F is one-to-one.
To show that the range of (I + F) is BC, consider the equation

\[ e = r - Fe = G_r(e). \]

We view the right-hand side as a mapping G_r of BC into itself parameterized by r.
If we can show that G_r has a fixed point for each r, we will have shown that the
range of (I + F) is BC. Now

\[
\begin{aligned}
d_\infty(G_r e_1, G_r e_2) &= \sup_t |r(t) - (Fe_1)(t) - r(t) + (Fe_2)(t)| \\
&= \sup_t |(Fe_2)(t) - (Fe_1)(t)| = d_\infty(Fe_1, Fe_2).
\end{aligned}
\]

Since F is a contraction mapping, it follows from the above equation that G_r is also
a contraction mapping for arbitrary r. It follows from the Contraction Mapping
Theorem that G_r has a fixed point for arbitrary r. Therefore ℛ(I + F) = BC.


Since I + F is one-to-one and ℛ(I + F) = BC, it is invertible. Hence we can
meaningfully write

\[ e = (I + F)^{-1} r. \]

Finally, let us consider the continuity of (I + F)^{-1}. Let r_1 and r_2 be arbitrary
inputs and let

\[ e_1 = (I + F)^{-1} r_1, \qquad e_2 = (I + F)^{-1} r_2. \]

Then

\[ e_1 - e_2 = r_1 - r_2 - Fe_1 + Fe_2. \]

This implies that

\[ d_\infty(e_1,e_2) \le d_\infty(r_1,r_2) + d_\infty(Fe_1,Fe_2). \]

Since F is a contraction mapping, there exists a constant k, 0 ≤ k < 1, such that
d_∞(Fe_1,Fe_2) ≤ k d_∞(e_1,e_2). Hence

\[ d_\infty(e_1,e_2) \le \frac{1}{1-k}\, d_\infty(r_1,r_2), \]


which shows that (I + F)^{-1} is continuous. In fact it is uniformly continuous.
Moreover, (I + F)^{-1} maps bounded sets into bounded sets.

Let us define the supremum of the incremental gain of a transformation T of
a metric space X into itself by

\[ g(T) = \sup\left\{ \frac{d(Tx,Ty)}{d(x,y)} : x, y \in X \text{ and } x \ne y \right\}. \]

Then we see that in the present example g(F) ≤ k < 1 implies that

\[ g((I + F)^{-1}) \le (1 - k)^{-1}. \]

Of course, the reader familiar with feedback theory will not be surprised that
I + F is "well-behaved" when the "loop gain" is less than unity. ∎
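The argument of Example 3 is constructive: e = (I + F)^{-1} r is the limit of the iteration e ← r − F(e). The sketch below is a hedged illustration, not the book's method. It assumes signals sampled at finitely many times, and a hypothetical memoryless nonlinearity F(e)(t) = k sin(e(t)) with k = 0.5, which is a contraction in the sup-metric with constant k. The names K_GAIN, F, sup_dist and solve_error_signal are all illustrative.

```python
import math

K_GAIN = 0.5   # contraction constant k of the hypothetical operator F

def F(e):
    """Hypothetical memoryless nonlinearity: (Fe)(t) = k * sin(e(t))."""
    return [K_GAIN * math.sin(v) for v in e]

def sup_dist(x, y):
    """Sup-metric between two sampled signals of equal length."""
    return max(abs(a - b) for a, b in zip(x, y))

def solve_error_signal(r, n_iter=80):
    """Approximate e = (I + F)^{-1} r by iterating e <- r - F(e) from e = 0."""
    e = [0.0] * len(r)
    for _ in range(n_iter):
        Fe = F(e)
        e = [ri - fi for ri, fi in zip(r, Fe)]
    return e
```

Solving for two inputs r_1 and r_2 and comparing sup_dist(e1, e2) with sup_dist(r1, r2) / (1 − K_GAIN) gives a numerical check of the incremental gain bound derived above.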


EXERCISES 
1. In Example 3, nothing was said about F or (I + F)^{-1} being causal, as in Section
2.8. If F is causal as well as a contraction, does it follow that (I + F)^{-1} is
causal?

2. Let T(iω) denote the Fourier transform transfer function for a system that
maps L_2(−∞,∞) into itself. When does T(iω) represent a contraction mapping?
3. Let f(x,y) be a continuous real-valued function defined on a rectangle

\[ \mathcal{R} = \{(x,y) : |x - x_0| \le a,\ |y - y_0| \le b\} \]

and satisfying y_0 = f(x_0,y_0). Assume there is a k, 0 ≤ k < 1, such that

\[ |f(x,y) - f(x,y')| \le k|y - y'| \]

for (x,y) and (x,y′) in ℛ. Use the Contraction Mapping Theorem to show that
there is an α > 0 and a continuous function y = g(x), defined for |x − x_0| ≤ α,
such that g(x) = f(x,g(x)) for |x − x_0| ≤ α and y_0 = g(x_0).

4. (Continuation of Exercise 3.) Let f(x,y) be continuous on ℛ and satisfy
y_0 = f(x_0,y_0). Assume also that ∂f/∂y = f_y is continuous on ℛ and that
f_y(x_0,y_0) = 0. Show that there is an α > 0 and a continuous function y = g(x),
defined for |x − x_0| ≤ α, such that y_0 = g(x_0) and

\[ g(x) = f(x,g(x)), \qquad |x - x_0| \le \alpha. \]


5. (Implicit Function Theorem; continuation of Exercises 3 and 4.) Let F(x,y) be
a C^1 function on the rectangle ℛ with F(x_0,y_0) = 0. Assume that

\[ F_y(x_0,y_0) \ne 0. \]

Show that there is an α > 0 and a continuous function g(x), defined for
|x − x_0| ≤ α, such that y_0 = g(x_0) and

\[ F(x,g(x)) = 0, \qquad |x - x_0| \le \alpha. \]

[Hint: Apply Exercise 4 to f(x,y) = y − F_y^{-1}(x_0,y_0) F(x,y).]


6. (Inverse Mapping Theorem.) Let x = f(y) be a C^1 function defined for
|y − y_0| ≤ b with x_0 = f(y_0) and f′(y_0) ≠ 0. Show that there is an α > 0
and a continuous function y = g(x), defined for |x − x_0| ≤ α, such that
x = f(g(x)) for |x − x_0| ≤ α and y_0 = g(x_0).


7. Consider the nonlinear Volterra integral equation

\[ y(t) = h(t) + \int_0^t k(t,s)\, f(y(s),s)\, ds, \qquad (3.15.10) \]

where h(t), k(t,s), and f(y,s) are continuous for 0 ≤ t, 0 ≤ s, and all y. Assume
that there are positive constants A, B, and K such that

\[ |f(y,s)| \le A + B|y|, \]
\[ |f(y,s) - f(y',s)| \le K|y - y'|. \]

Show that there is an α > 0 and a continuous function y(t) defined for 0 ≤ t ≤ α,
and satisfying (3.15.10). [Hint: Carefully examine Example 2.]


8. The condition that the Lipschitz coefficient k in Definition 3.15.1 satisfy k < 1
cannot be entirely omitted. Show that


f(x) = x — te", x <0 


has no fixed points but |f(x) − f(y)| < |x − y| for all x and y in R. [Remark:
One can show that certain nonexpansive mappings, that is, mappings satisfying
‖f(x) − f(y)‖ ≤ ‖x − y‖, on Banach spaces have fixed points. See Browder [1].]


9. Is there a Contraction Mapping Theorem for pseudometric spaces? If so,
what happens to uniqueness?

10. Show that the system


4 ' 2 
XG = ean a a 8 
_ 4 1 1 
X_ = 4X, + 5X2 + 3x3 — 1, 


es ay I 
X3 = — fx, + 5X2 — 5X3 4+ 2, 


has a unique solution by using the Contraction Mapping Theorem. 


16. TOTAL BOUNDEDNESS AND APPROXIMATIONS


Compactness, the subject of the next section, is a property of metric spaces
and subsets of metric spaces. We shall see that it is a topological property, that is,
if the metric spaces X and Y are homeomorphic to one another, then X is compact
if and only if Y is compact.

Before considering the concept of compactness itself, let us introduce the
concept of total boundedness. Recall that a set A is said to be bounded if it has
finite diameter.


EXAMPLE 1. Let d_∞ be the sup-metric on C[0,T]. The set A, made up of all
functions x in (C[0,T],d_∞) such that |x(t)| ≤ 1, is bounded, since diam(A) = 2.
The set B made up of all x in (C[0,T],d_∞) such that

\[ \int_0^T |x(t)|\, dt \le 1 \]

is not bounded. (Why?) ∎


Now let A be a set contained in a metric space (X,d). Suppose that we are given
an ε > 0 and that we want to find a distinguished subset of A, call it A_ε, with the
property that for each point x in A there is a y in A_ε such that d(x,y) ≤ ε. Figure
3.16.1 illustrates this idea.


[Figure 3.16.1. The shaded set A is covered by open balls of radius ε; A_ε is the finite set made up of the crosses, +.]


This problem does not become interesting until one places further conditions
on the distinguished set A_ε. For example, one may want A_ε to be finite. In fact,
it is very convenient if a finite A_ε can be found for each ε > 0. Roughly
speaking, it means that no matter how small ε is, a finite, albeit somewhat larger,
set A_ε exists. For example, let (X,d) be the real Euclidean plane. Then the bounded
rectangle shown in Figure 3.16.2(a) has this property, whereas the half-plane
shown in Figure 3.16.2(b) does not.


[Figure 3.16.2. Panels (a) and (b).]




3.16.1 DEFINITION. Let A be a set contained in a metric space (X,d). Given
ε > 0, a subset A_ε of A is said to be an ε-net of A if (1) A_ε is finite and (2) for each
x ∈ A there is a y ∈ A_ε such that d(y,x) ≤ ε.


3.16.2 DEFINITION. A set A in a metric space (X,d) is said to be totally
bounded if for each ε > 0, A contains an ε-net.


For what it is worth we note that the empty set is always totally bounded. 
More important though, we note that any finite set is totally bounded. Of course, 
these examples are also examples of bounded sets. 

However, total boundedness and boundedness are not the same thing, and it 
is important that the difference be understood. To begin with, total boundedness 
is a stronger property than boundedness. 


3.16.3 LEMMA. Let A be a set contained in a metric space (X,d). If A has an
ε-net for some ε > 0, then A is bounded. In particular, every totally bounded set is
bounded.


Proof: Let A_ε be an ε-net for A. Then A_ε contains a finite number of points
{y_1,y_2,…,y_n}. (If A_ε is empty, the conclusion is obvious.) Now let

\[ B = \max\{d(y_i,y_j) : 1 \le i,j \le n\}, \]

which is a finite number. We now claim that

\[ \operatorname{diam}(A) \le B + 2\varepsilon. \]

Indeed, if x_1 and x_2 are any two points in A, then there are two points in A_ε, call
them y_1 and y_2, such that

\[ d(x_i,y_i) \le \varepsilon, \qquad i = 1,2. \]

Thus one has

\[ d(x_1,x_2) \le d(x_1,y_1) + d(y_1,y_2) + d(y_2,x_2) \le B + 2\varepsilon; \]

hence diam(A) ≤ B + 2ε, or A is bounded. ∎


So total boundedness implies boundedness. However, an extremely important
point is that boundedness does not imply total boundedness. This is illustrated in
the examples below.

Before we turn to these examples, we ask the reader to verify the next assertion: 


3.16.4 LEMMA. Let (X,d) be totally bounded. Then X is separable. 


EXAMPLE 2. Consider the space (l_2,d_2) (see Example 4, Section 3) and let
A be the set of all points in (l_2,d_2) such that Σ_{n=1}^∞ |x_n|² ≤ 1. The set A is bounded.
Since

\[ d_2(x,y) = \left( \sum_{n=1}^{\infty} |x_n - y_n|^2 \right)^{1/2} \le \left( \sum_{n=1}^{\infty} |x_n|^2 \right)^{1/2} + \left( \sum_{n=1}^{\infty} |y_n|^2 \right)^{1/2} \]

by the Minkowski Inequality, we see that diam(A) ≤ 2. Yet the set A is not totally
bounded. Consider the set E = {e_1,e_2,…} of points in A, where e_1 = {1,0,0,…},
e_2 = {0,1,0,…}, e_3 = {0,0,1,0,…}, and so on. We see that d_2(e_k,e_j) = √2 for k ≠ j.
If an ε-net A_{1/2} exists for ε = 1/2, there must be an appropriate finite set in A. But the
closed balls B_{1/2}[e_k] and B_{1/2}[e_j] are disjoint for k ≠ j. Thus if the set A_{1/2} contains
a point within distance 1/2 of each e_k, A_{1/2} must have at least one of its points in each
closed ball B_{1/2}[e_k]. It follows that A_{1/2} must be at least countably infinite. Therefore,
an ε-net for ε = 1/2 does not exist, and A is not totally bounded. See Figure
3.16.3. ∎
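The counting argument of Example 2 can be checked numerically on a finite truncation. The sketch below is an assumption-laden illustration, not part of the original: l_2 is replaced by R^dim, so only the first dim unit vectors are available, and the names unit_vector, d2 and covered are hypothetical. It shows that distinct unit vectors are √2 apart, so no single point lies within 1/2 of two of them.

```python
import math

def unit_vector(k, dim):
    """k-th coordinate vector in R^dim, a finite truncation of e_k in l_2."""
    return [1.0 if i == k else 0.0 for i in range(dim)]

def d2(x, y):
    """Euclidean distance, the truncated l_2 metric."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))

def covered(a, vectors, radius=0.5):
    """Indices k such that the point a lies within `radius` of vectors[k]."""
    return [k for k, e in enumerate(vectors) if d2(a, e) <= radius]
```

With dim unit vectors, any candidate net point covers at most one of them, so a 1/2-net of the truncated set needs at least dim points. Letting dim grow without bound is what rules out a finite net in l_2.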


[Figure 3.16.3. The set A.]


EXAMPLE 3. Consider the Lebesgue space L_2(−∞,∞) with the usual
metric d_2. Let A be the set of all points x in L_2(−∞,∞) such that

\[ \int_{-\infty}^{\infty} |x(t)|^2\, dt \le 1. \]

As in Example 2, it can be shown that the set A is bounded but not totally bounded.
[Hint: Consider the Hermite functions

\[ \psi_n(t) = (2^n n! \sqrt{\pi})^{-1/2} H_n(t) e^{-t^2/2}, \qquad n = 0,1,2,\ldots \]

where H_n(t) is the Hermite polynomial

\[ H_n(t) = (-1)^n e^{t^2} \frac{d^n}{dt^n} e^{-t^2}, \qquad n = 0,1,2,\ldots \]

(See Section 7.14.)] ∎


EXAMPLE 4. Consider the metric d_2 on R^n, as in Example 2, Section 3,
and let A be the set in (R^n,d_2) of all points x = (x_1,x_2,…,x_n) such that

\[ \sum_{i=1}^{n} |x_i|^2 \le 1. \]

The set A is, of course, bounded. Let us show that A contains an ε-net A_ε for each
ε > 0. Let K be a positive integer such that √n ≤ εK. Let A_ε be the set of all n-tuples
(y_1,…,y_n) such that y_j (j = 1,2,…,n) can take only the values m/K (where m
is an integer with −K ≤ m ≤ K) and

\[ \sum_{j=1}^{n} |y_j|^2 \le 1. \]

The set A_ε is finite and A_ε ⊂ A. It is apparent that for an arbitrary x = (x_1,…,x_n)
in A there is a point y = (y_1,…,y_n) in A_ε such that |x_j − y_j| ≤ 1/K or

\[ d_2(x,y) \le \left[ \sum_{j=1}^{n} \left( \frac{1}{K} \right)^2 \right]^{1/2} = \frac{\sqrt{n}}{K} \le \varepsilon. \]

Hence, A_ε is an ε-net and, ε being arbitrary, A is totally bounded. ∎
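The net point promised in Example 4 can be produced explicitly. The sketch below makes one choice the text leaves open, labeled here as an assumption: each coordinate is rounded toward zero onto the grid of multiples of 1/K, so that |y_j| ≤ |x_j| and y stays in the unit ball. The names net_point and d2 are hypothetical.

```python
import math

def d2(x, y):
    """Euclidean metric on R^n."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))

def net_point(x, eps):
    """Return (y, K): a grid point y with coordinates m/K near x, and the K used.

    K is the smallest positive integer with sqrt(n) <= eps * K. Rounding
    toward zero (an illustrative choice) gives |x_j - y_j| < 1/K and
    |y_j| <= |x_j|, so y remains in the unit ball whenever x is in it.
    """
    n = len(x)
    K = max(1, math.ceil(math.sqrt(n) / eps))
    y = [math.trunc(xj * K) / K for xj in x]
    return y, K
```

Since every coordinate error is below 1/K, the distance d2(x, y) is at most √n / K ≤ ε, which is the estimate in the text. The full net has at most (2K + 1)^n points, which grows quickly with n and with 1/ε.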


EXAMPLE 5. Return to the metric space (l_2,d_2) of Example 2. Let A be the
Hilbert cube, that is, the set of all points x = {x_n} in (l_2,d_2) such that |x_n| ≤ 1/n.
It follows that

\[ \lim_{N \to \infty} \sum_{n=N}^{\infty} |x_n|^2 = 0 \quad \text{uniformly over } A, \]

that is, given an ε > 0 there exists an integer N = N(ε) such that

\[ \sum_{n=N}^{\infty} |x_n|^2 \le \sum_{n=N}^{\infty} \frac{1}{n^2} \le \left( \frac{\varepsilon}{2} \right)^2. \]

Carefully note that N(ε) is independent of the point in A. Choose K to be an integer
with 2√N ≤ εK. Let A_ε be the set of points y = {y_n} in A such that y_n = 0 for
n > N, and such that the values of y_i, i = 1,…,N, are restricted to the numbers
m/K (|m| ≤ K), in the spirit of Example 4. It is then a simple matter to argue
that for an arbitrary x = {x_1,x_2,…} in A there is a point y in A_ε such that
|x_i − y_i| ≤ 1/K, i = 1,…,N, and

\[ d_2^2(x,y) = \sum_{n=1}^{N} |x_n - y_n|^2 + \sum_{n=N+1}^{\infty} |x_n|^2 \le \frac{N}{K^2} + \frac{\varepsilon^2}{4} \le \frac{\varepsilon^2}{2}. \]

It follows immediately that A is totally bounded. ∎


This argument generalizes, and we have the following result. 


3.16.5 THEOREM. A bounded set A in (l_2,d_2) is totally bounded if and only
if for every ε > 0 there is an N = N(ε) (independent of x) such that

\[ \sum_{n=N}^{\infty} |x_n|^2 \le \varepsilon \]

for all x = {x_1,x_2,…} in A.


EXAMPLE 6. Let d_∞ be the sup-metric on C[0,T]. The unit ball

\[ B = \{x \in C[0,T] : d_\infty(0,x) \le 1\} \]

is a bounded set but not totally bounded. (We ask the reader to verify this.) We
will show that given an L > 0, the subset of B given by

\[ B_L = \{x \in B : |x(t) - x(s)| \le L|t - s|\} \]

is totally bounded.

Let ε > 0 be given. We construct an ε-net A_ε for B_L as follows: First choose K
and N so that N and KLT/N are integers satisfying 3 ≤ Kε and 3LT ≤ Nε. Next
divide the interval [0,T] into N equal parts 0 = t_0 < t_1 < ⋯ < t_N = T, where
t_i − t_{i−1} = h = TN^{-1}. Now let A_ε be the collection of all continuous piecewise
linear functions p defined on [0,T] such that

\[ p(t_i) = j/K \quad \text{for some integer } j,\ |j| \le K, \text{ and } i = 0,1,\ldots,N, \]

and p is linear between t_i and t_{i+1} with |p(t_{i+1}) − p(t_i)| ≤ Lh. Clearly A_ε is finite.
In fact A_ε contains no more than (2K + 1)(2KLh + 1)^N elements, since there are
2K + 1 choices for p(t_0) and at most 2KLh + 1 choices for each successive increment.
An illustration is given in Figure 3.16.4.


[Figure 3.16.4. A function p in A_ε and an arbitrary function x in B_L, plotted over [0,T] with grid points h, 2h, …, (N−1)h, T.]


Let x ∈ B_L. We claim that there is a p in A_ε that satisfies:

\[ |x(t_i) - p(t_i)| \le \frac{1}{K}, \qquad 0 \le i \le N. \]

Indeed, since |x(t_i) − x(t_{i−1})| ≤ Lh ≤ ε/3, one then has for t_i ≤ t ≤ t_{i+1}

\[
\begin{aligned}
|x(t) - p(t)| &\le |x(t) - x(t_i)| + |x(t_i) - p(t_i)| + |p(t_i) - p(t)| \\
&\le 2L|t - t_i| + \frac{\varepsilon}{3} \\
&\le 2Lh + \frac{\varepsilon}{3} \le \varepsilon.
\end{aligned}
\]

By our choice of K we get d_∞(x,p) ≤ ε, which shows that A_ε is an ε-net and that
B_L is totally bounded. ∎


The reader can probably appreciate that total boundedness is a useful concept.
Indeed the notion of approximating an arbitrary point by a special point from a 
pre-assigned finite collection has applications in many areas including numerical 
analysis. Going beyond this, we will show in the next section that total boundedness 
is intimately connected to the more general concept of compactness. In fact, we 
will eventually show (Theorem 3.17.13) that a metric space is compact if and only 
if it is (1) totally bounded and (2) complete. 


EXERCISES 
1. Let A ⊂ R^n, where R^n has the metric

\[ d(x,y) = \sum_{i=1}^{n} |x_i - y_i|. \]

Show that A is totally bounded if and only if A is bounded.

2. (Continuation of Exercise 1.) Find a metric on R^n that is equivalent to the
metric in Exercise 1 and such that there is a bounded set A ⊂ R^n where A
is not totally bounded.

3. Let A, B be sets in a metric space (X,d) with A ⊂ B. Show that A is totally
bounded whenever B is totally bounded.


4. Let k(t,s) = Σ_{i=1}^n φ_i(t) ψ_i(s) where φ_1,…,φ_n, ψ_1,…,ψ_n belong to L_2[0,T]
where 0 < T < ∞. Define y = Kx by

\[ y(t) = \int_0^T k(t,s)\, x(s)\, ds. \]

Show that K maps L_2[0,T] into itself. Let

\[ A = \{y : y = Kx \text{ for some } x \text{ with } d_2(0,x) \le 1\}, \]

where d_2 is the usual metric on L_2[0,T]. Show that A is totally bounded.


5. Prove Lemma 3.16.4. [Hint: Consider a sequence of ε-nets for ε = 2^{-n},
n = 1, 2, ….]

6. Discuss the concept of total boundedness in a pseudometric space. 

7. Prove Theorem 3.16.5. 


The following four exercises should be reconsidered after studying Section 
3.17. 


8. This exercise is the vector version of Example 6. Consider the space X =
C[I,R^n] of continuous functions x(·) defined on I = [0,T] with values in R^n.
Assume that R^n has the metric

\[ d_1(x,y) = \sum_{i=1}^{n} |x_i - y_i| \]

and that X has the metric

\[ d_\infty(x(\cdot),y(\cdot)) = \sup_t d_1(x(t),y(t)). \]

Let B and B_L be defined by

\[ B = \{x(\cdot) \in X : d_\infty(0,x(\cdot)) \le 1\}, \]
\[ B_L = \{x(\cdot) \in B : d_1(x(t),x(s)) \le L|t - s|\}. \]

Show that B is not totally bounded. Show that B_L is totally bounded.


9. Let x(t) = e^{tA} x_0 denote the solution of the linear differential equation
x′ = Ax with x(0) = x_0 ∈ R^n. Let d_1 be defined as in Exercise 8 and let S_K
denote the collection of all such solutions of all such equations subject to the
conditions

\[ d_1(0,x_0) \le 1, \qquad \max_{i,j} |a_{ij}| \le K, \qquad (3.16.1) \]

where A = (a_{ij}). Show that S_K is a totally bounded subset of C(I,R^n). (See
Exercise 8.) [Hint: Observe that if A satisfies (3.16.1), then

\[ d_1(0,Ax) \le K\, d_1(0,x) \]

for all x ∈ R^n.]
10. Consider the control differential equation

\[ x' = Ax + Bu, \qquad (3.16.2) \]

where x ∈ R^n, u ∈ R^m, and A and B are fixed constant matrices of size n × n
and n × m, respectively. Let D denote the collection of all solutions x(·) of
(3.16.2) subject to the conditions: x(0) = 0 and d_1(0,u(t)) ≤ 1. Show that D
is totally bounded in C(I,R^n). (Compare with Exercise 8.)


11. It is known that if x(t) is a solution of the Gronwall Inequality

\[ |x(t)| \le a(t) + \int_0^t |x(s)|\, b(s)\, ds, \qquad t \ge 0, \qquad (3.16.3) \]

where a(·) and b(·) are continuous, nonnegative functions, then

\[ |x(t)| \le a_m(t)\, e^{B(t)}, \]

where a_m(t) = max_{0≤s≤t} a(s) and B(t) = ∫_0^t b(s) ds.
Let y(t) = y_0 + ∫_0^t x(s) ds where x(·) satisfies (3.16.3). Let a(·) come from
a set A, b(·) from a set B, and let the corresponding y(·) range in a set Y.
Find conditions on A, B, and y_0 in order that Y be totally bounded in C[0,T],
where C[0,T] has the sup-metric.


17. COMPACTNESS 


There are at least four equivalent ways of defining compactness in metric
spaces. The definition we choose is based on the aspect of compactness that is
used most in applications, namely, sequential compactness. We shall show in this
section that sequential compactness in a metric space is equivalent to three other
forms of compactness, Theorem 3.17.13.

Before we state the definition let us recall the concept of a subsequence. A
sequence in a set X is a mapping x of the positive integers I^+ into X. The value of
x at the point n is denoted by x_n. Now let g: I^+ → I^+ be a strictly increasing
mapping, that is, g(m) < g(n) whenever m < n. A subsequence of x: I^+ → X is a
sequence of the form x∘g: I^+ → X where g: I^+ → I^+ is strictly increasing. For
example, let x_n = 1/n and g(n) = n²; then x∘g(n) = 1/n². In other words, {1, 1/4, 1/9, …}
is a subsequence of {1, 1/2, 1/3, …}. In general if {x_1,x_2,…} is any sequence, and
if {n_k} are positive integers with n_1 < n_2 < …, then {x_{n_1},x_{n_2},…} is a subsequence
of {x_1,x_2,…}.
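The definition of a subsequence as a composition x∘g translates directly into code. The following minimal sketch uses the text's own example x_n = 1/n and g(n) = n²; the helper name subsequence is hypothetical, and exact rational arithmetic is used only so that the terms print as fractions.

```python
from fractions import Fraction

def x(n):
    """The sequence x_n = 1/n on the positive integers."""
    return Fraction(1, n)

def g(n):
    """A strictly increasing index map g(n) = n**2."""
    return n * n

def subsequence(x, g):
    """Return the composition x o g, itself a sequence on the positive integers."""
    return lambda n: x(g(n))
```

Evaluating subsequence(x, g) at n = 1, 2, 3 gives 1, 1/4, 1/9, the first terms of the subsequence named in the text. Any other strictly increasing g, such as g(n) = 2n, selects a different subsequence of the same x.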

We shall say that the sequence {x_1,x_2,…} contains a convergent subsequence
if at least one of its subsequences is convergent. For example, the sequence
{1, 1/2, 3, 1/4, 5, 1/6, …} has {1/2, 1/4, 1/6, 1/8, …} as a convergent subsequence. Note that the sequence
itself is not convergent, and it contains many subsequences, such as {1,3,5,7,…},
which are not convergent.

We are now prepared to define sequential compactness. 


3.17.1 DEFINITION. A metric space (X,d) is said to be sequentially compact
if every sequence in (X,d) contains a convergent subsequence. A set A ⊂ X is
said to be sequentially compact if the subspace (A,d) is sequentially compact. This
means that every sequence in A contains a subsequence that converges to a point
in A. For example, the set (0,1] is not sequentially compact in R since the sequence
{1/n} does not have a subsequence with a limit in (0,1].


Roughly speaking, a sequentially compact metric space is so "crowded" that
no matter how hard one tries to choose a sequence, an infinite number of the
elements will always "pile up" around at least one point in the metric space.

The following lemma is an immediate consequence of the Closed Set Theorem. 


3.17.2 LEMMA. If A is a sequentially compact set in a metric space (X,d),
then A is a closed set.


3.17.3 THEOREM. Let (X,d) be a sequentially compact metric space. A set
A ⊂ X is sequentially compact if and only if A is closed.


Proof: It follows from the last lemma that if A is sequentially compact, then
it is closed. So now assume that A is closed. Let {x_n} be a sequence in A ⊂ X.
Since X is sequentially compact we can find a subsequence {x_{n′}} with limit x_0 in
X. Since A is closed, it follows from the Closed Set Theorem that x_0 ∈ A. Hence
A is sequentially compact. ∎


It is important to note that we do assume (X,d) to be sequentially compact
in the last theorem. The theorem is not true otherwise. That is, if (X,d) is not
sequentially compact, then A = X is a closed set that is not sequentially compact.


A set A ⊂ X is said to have compact closure in (X,d) if the closure of A is a
sequentially compact set in (X,d). In this case one sometimes says that A is relatively
compact or conditionally compact.

We emphasized in the definition of sequential compactness that the limit of
the subsequence had to be in the given set A. Not too surprisingly, when the limit
of the subsequence exists but is not necessarily in A, we get compact closure.


3.17.4 LEMMA. Let A ⊂ X where (X,d) is a metric space. The following
statements are equivalent:


(a) A has compact closure, that is, Ā is sequentially compact.
(b) Every sequence in A has a subsequence that converges (to a point in X).


Proof: (a) ⇒ (b). If Ā is sequentially compact, then every sequence in
A ⊂ Ā has a subsequence that converges to a point in Ā ⊂ X.

(b) ⇒ (a). Let {x_n} be a sequence in Ā. It follows from the definition of the
closure Ā that there is a sequence {y_n} in A such that d(x_n,y_n) ≤ 1/n. From statement
(b) we can choose a subsequence, call it {y_{n′}}, of {y_n} such that {y_{n′}} converges.
Say that z = lim y_{n′}. (It is clear that z ∈ Ā.) Since

\[ d(z,x_{n'}) \le d(z,y_{n'}) + d(y_{n'},x_{n'}) \to 0, \]

we see that z = lim x_{n′}. Hence Ā is sequentially compact. ∎


How is sequential compactness related to some of the other concepts introduced 
previously? More precisely, how is it related to completeness? To total bounded- 
ness? Before we answer this let us prove the following lemma. 


3.17.5 LEMMA. Let (X,d) be a sequentially compact metric space and let
{M_n} be a decreasing sequence (that is, M_n ⊃ M_{n+1}) of nonempty closed sets. Then
∩_{n=1}^∞ M_n is nonempty.


Proof: Choose x_n ∈ M_n for n = 1,2,…. Since {M_n} is decreasing one has
x_n ∈ M_N for all n ≥ N and all N. Since (X,d) is sequentially compact, there is a
convergent subsequence {x_{n′}} with x_0 = lim x_{n′} for some x_0 ∈ X. Since M_N is
closed, it follows from the Closed Set Theorem that x_0 ∈ M_N for every N. Hence
x_0 ∈ ∩_{N=1}^∞ M_N. ∎

3.17.6 LEMMA. If (X,d) is sequentially compact, then it is complete.

Proof: Let {M_n} be any decreasing sequence of nonempty closed sets with
diam(M_n) → 0 as n → ∞. It follows from Lemma 3.17.5 that ∩_n M_n contains
exactly one point. (Why?) It now follows from Theorem 3.13.11 that (X,d) is
complete. ∎


3.17.7 LEMMA. If (X,d) is sequentially compact, then it is totally bounded.


Proof: We shall prove this by contradiction. If (X,d) does not contain an
ε-net for some ε > 0, then we can find a sequence of points {x_n} in X with the
property that d(x_n,x_m) ≥ ε whenever n ≠ m. But this implies that the sequence
{x_n} contains no convergent subsequence; compare with Exercise 8, Section 6.
Hence (X,d) is not sequentially compact, and this is a contradiction. ∎


One might paraphrase the last two lemmas as follows: Lemma 3.17.6 says
that a sequentially compact metric space does not have any "holes" in it and
Lemma 3.17.7 says that such a space is "cramped" or "crowded."


3.17.8 THEOREM. A metric space (X,d) is sequentially compact if and only if 
it is totally bounded and complete. 


Proof: Lemmas 3.17.6 and 3.17.7 show that a sequentially compact space is 
totally bounded and complete. Thus we must show here that total boundedness 
and completeness imply sequential compactness. 

Let S_1 = {x_1(1),x_2(1),x_3(1),…} be an arbitrary sequence in (X,d). We denote
the sequence with a one in parentheses to distinguish it from subsequent sequences.
Since (X,d) is totally bounded, there exists a finite collection of open balls, each
with radius 2^{-1}, that covers (X,d). It follows that at least one of these open balls
contains a subsequence of S_1. Denote this subsequence by S_2 = {x_1(2),x_2(2),…}.
Using the total boundedness of (X,d) again, there exists a subsequence of S_2 that
is contained in an open ball of radius 2^{-2}. Denote this subsequence by S_3 =
{x_1(3),x_2(3),…}. We continue successively forming subsequences of subsequences
in this manner so that the subsequence S_n = {x_1(n),x_2(n),…} lies in an open ball
of radius 2^{-n}. We thus obtain the following infinite array:


S_1:  x_1(1), x_2(1), x_3(1), …
S_2:  x_1(2), x_2(2), x_3(2), …
S_3:  x_1(3), x_2(3), x_3(3), …


Next let S be the sequence made up of the diagonal entries in this array, that is,
S = {x_1(1),x_2(2),x_3(3),…}. Owing to our method of construction, S is a subsequence
of S_1 and S is a Cauchy sequence. (Why?) Since (X,d) is complete, S
converges. Thus, each sequence in (X,d) contains a convergent subsequence, and
(X,d) is sequentially compact. ∎


There is yet another way of characterizing sequentially compact metric spaces 
which is often useful. 


3.17.9 DEFINITION. A metric space (X,d) is said to possess the Bolzano- 
Weierstrass property if every infinite subset of (X,d) has at least one point of 
accumulation. A set A in (X,d) is said to possess the Bolzano-Weierstrass property 
if the space (A,d) has this property. 


Note that if X or A is finite, it possesses the Bolzano-Weierstrass property
because it has no infinite subsets. The intuitive idea behind the Bolzano-Weierstrass
property is similar to that behind sequential compactness: No matter how hard
one tries, one cannot select an infinite set that does not "pile up" around at least
one point in the space. Not too surprisingly this property is equivalent to sequential
compactness. We ask the reader to verify this.

There is still another form of compactness that is equivalent to sequential 
compactness. This is the so-called Heine-Borel compactness. To state this we need 
the following definition. 


3.17.10 DEFINITION. Let A be a set in a metric space (X,d). A collection of
sets {M_α} in (X,d) is said to be a covering of A if A ⊂ ∪_α M_α. A subcollection
{M_β} of a covering {M_α} with the property that A ⊂ ∪_β M_β is said to be a
subcovering of {M_α}. Any covering or subcovering made up entirely of open sets is
said to be an open covering or open subcovering.


Next the definition of compactness. 


3.17.11 DEFINITION. A metric space (X,d) is said to be compact (Heine- 
Borel compact) if every open covering of (X,d) contains a finite open subcovering. 
A set A in a metric space (X,d) is said to be compact if the metric space (A,d) is 
compact; that is, if each open covering of A contains a finite open subcovering. 


Let us emphasize that we ask that each open covering contains a finite open
subcovering. We are not saying that a set A has a finite open covering, or that some
open coverings have finite subcoverings. For example, the topology 𝒯 of a metric
space (X,d) is an open covering of (X,d). It contains a finite subcovering; namely,
the collection consisting of X alone. In fact, any open covering that contains X
contains a finite open subcovering. The interesting point about compact spaces is
that even open coverings made up of "very small" open sets contain finite
subcoverings.

This version of compactness is probably the hardest to understand. However, 
it is this version that is really the most fundamental. The reason for this is that 
Heine-Borel compactness can easily be generalized to topological spaces that are 
not metrizable. 

For our purposes we have the following equivalence. 


3.17.12 THEOREM. A metric space (X,d) is sequentially compact if and only 
if it is compact. 


We shall outline a proof of this theorem in the exercises. It should be noted
that it is not important, for the purpose of the book, that the reader master this
proof. We shall only use this result as an excuse to use the phrase "compact" in
place of "sequentially compact." All of our proofs will be based on the concept
of sequential compactness and Theorem 3.17.8.




We have given four versions of compactness, all of them equivalent in metric 
spaces. Let us summarize this. 


3.17.13 THEOREM. (COMPACTNESS THEOREM.) Let (X,d) be a metric space. 
Then the following statements are equivalent: 


(a) (X,d) is compact. 

(b) (X,d) is sequentially compact. 

(c) (X,d) is complete and totally bounded. 

(d) (X,d) possesses the Bolzano-Weierstrass property. 


EXAMPLE 1. Let R be the real line with the usual metric, and let $A \subset R$.
When is A compact? The following theorem, originally proved by Heine and 
Borel, characterizes compact subsets of R. 


3.17.14 THEOREM. A set $A \subset R$ is compact if and only if it is closed and
bounded. 


Proof: If A is compact, then it is closed and totally bounded (Lemma
3.17.2 and Theorem 3.17.8). However, every totally bounded set is bounded
(Lemma 3.16.3).

Now assume that A is closed and bounded. Since R is complete, it follows that
A is complete (Theorem 3.13.5). Since A is bounded, it is totally bounded. (Why?)
Therefore A is compact (Theorem 3.17.13). ∎
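
The "(Why?)" above can be explored numerically. The following is a minimal sketch, not part of the original text: a bounded set A lies in some interval [a,b], and a grid of spacing ε on [a,b] gives finitely many points within ε of every point of the interval, hence of A. The function names `epsilon_net` and `covers` are hypothetical, and the sketch assumes that net points are allowed to lie in [a,b] rather than in A itself.

```python
import math

def epsilon_net(a: float, b: float, eps: float) -> list[float]:
    """Return a finite eps-net for the interval [a, b] in the usual metric."""
    if eps <= 0 or b < a:
        raise ValueError("need eps > 0 and a <= b")
    n = math.ceil((b - a) / eps)  # number of steps of length eps
    # Grid points a, a + eps, ..., clipped at b so the net stays inside [a, b].
    return [min(a + k * eps, b) for k in range(n + 1)]

def covers(net: list[float], points: list[float], eps: float) -> bool:
    """Check that every sample point lies within eps of some net point."""
    return all(min(abs(x - p) for p in net) < eps for x in points)

net = epsilon_net(-2.0, 3.0, 0.5)
samples = [-2.0 + 5.0 * i / 1000 for i in range(1001)]
print(len(net), covers(net, samples, 0.5))
```

The number of net points grows like (b - a)/ε but is finite for every ε > 0, which is exactly what total boundedness asks for.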


Is compactness a topological property? How does it behave under continuous
mappings? Let us now look into these questions. 


3.17.15 THEOREM. Let $f: X \to Y$ be a continuous function, where (X,d) and
$(Y,\sigma)$ are metric spaces. If (X,d) is compact, then the range f(X) is a compact set
in $(Y,\sigma)$.


Proof: Let $\{y_n\}$ be a sequence in the range f(X). Then there are corresponding
points $\{x_n\}$ in X with $y_n = f(x_n)$. Since (X,d) is compact we can find a
subsequence of $\{x_n\}$ that converges in X, say $x_{n'} \to x$. Since f is continuous
one has $f(x_{n'}) \to f(x)$ in f(X) by Theorem 3.7.2. Hence $\{y_n\}$ has a convergent
subsequence and f(X) is compact. ∎


Since compactness is preserved under continuous mappings, it is obviously 
preserved by homeomorphisms. 


3.17.16 COROLLARY. Let (X,d) and $(Y,\sigma)$ be homeomorphic metric spaces.
Then (X,d) is compact if and only if $(Y,\sigma)$ is compact.




So compactness is a topological property, and compactness is equivalent to
total boundedness with completeness. But wait! We have very carefully pointed
out that completeness is not a topological property (Section 13). It also happens
that total boundedness is not a topological property. For example, let X = (0,1]
and $Y = [1,\infty)$, where X and Y are equipped with the usual metric. Then the
mapping y = 1/x is a homeomorphism of X onto Y. But (X,d) is totally bounded
and (Y,d) is not. Thus, completeness and total boundedness separately are not
topological properties, whereas taken together they are!
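
The example with y = 1/x can be illustrated on finite samples. This is a sketch under stated assumptions, not part of the original text: `greedy_net` is a hypothetical helper that keeps a point only if it is at least ε away from all points already kept, so every discarded point lies within ε of a kept one. Applied to samples of X = (0,1] the kept set stays small however many samples are taken, while for the images in Y it keeps growing.

```python
def greedy_net(points, eps):
    """Greedily pick points that are pairwise at least eps apart.

    Every point that is not picked lies within eps of a picked point,
    so the result is an eps-net for the finite sample.
    """
    net = []
    for p in points:
        if all(abs(p - q) >= eps for q in net):
            net.append(p)
    return net

xs = [1.0 / k for k in range(1, 201)]  # sample of X = (0, 1]
ys = [1.0 / x for x in xs]             # images under y = 1/x, roughly 1, 2, ..., 200

print(len(greedy_net(xs, 0.5)))  # stays bounded as the sample grows
print(len(greedy_net(ys, 0.5)))  # grows with the sample
```

A finite experiment proves nothing by itself, but it shows the mechanism: the homeomorphism stretches the points near 0 arbitrarily far apart.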

We are often interested in products of compact metric spaces. Fortunately, 
products do not offer any problems. 


3.17.17 THEOREM. Let $(X_1,d_1)$ and $(X_2,d_2)$ be compact metric spaces. Then
the product space (X,d), where $X = X_1 \times X_2$ and $d(x,y) = d_1(x_1,y_1) + d_2(x_2,y_2)$,
is compact.


Proof: Let $\{x_n\}$ be an arbitrary sequence in (X,d). Then $x_n = (x_n^1, x_n^2)$,
where $\{x_n^1\}$ and $\{x_n^2\}$ are sequences in $(X_1,d_1)$ and $(X_2,d_2)$, respectively. Since
$(X_1,d_1)$ is sequentially compact, the sequence $\{x_n^1\}$ contains a convergent
subsequence $\{x_{n'}^1\}$. Similarly, the corresponding sequence $\{x_{n'}^2\}$ in $(X_2,d_2)$ contains
a convergent subsequence $\{x_{n''}^2\}$. Since any subsequence of a convergent sequence
is convergent, the subsequence $\{x_{n''}^1\}$ taken from $\{x_{n'}^1\}$ is convergent. It follows
that the sequence $(x_{n''}^1, x_{n''}^2)$ in (X,d) is convergent. Hence, each sequence in
(X,d) contains a convergent subsequence, and (X,d) is compact. ∎


3.17.18 COROLLARY. Let $(X_1,d_1)$ and $(X_2,d_2)$ be compact metric spaces.
Then the metric space $(X,d')$, where $X = X_1 \times X_2$ with $d'$ any metric equivalent
to $d(x,y) = d_1(x_1,y_1) + d_2(x_2,y_2)$, is compact.


The proof of this corollary follows from the last theorem and the fact that 
compactness is a topological property. 


3.17.19 COROLLARY. Let $(X_1,d_1), \ldots, (X_n,d_n)$ be compact metric spaces.
Then the metric space (X,d), where $X = X_1 \times X_2 \times \cdots \times X_n$ and

$$d(x,y) = d_1(x_1,y_1) + \cdots + d_n(x_n,y_n)$$

or any equivalent metric, is compact.


A simple, though important, consequence of the above is the following
characterization of compact sets in $R^n$ or $C^n$. We ask the reader to prove this result.


3.17.20 THEOREM. Let $R^n$ (or $C^n$) be given with the metric
$$d(x,y) = \sum_{i=1}^{n} |x_i - y_i|.$$
Then a set $A \subset R^n$ (or $C^n$) is compact if and only if it is closed and bounded.


A useful observation concerns real-valued functions defined on compact sets. 


3.17.21 THEOREM. Let f be a continuous real-valued function defined on a
compact metric space (X,d). Then f is bounded, that is,

$$M = \sup\{f(x) : x \in X\}$$

and

$$m = \inf\{f(x) : x \in X\}$$

are finite. Moreover, there are points $x_{\min}$ and $x_{\max}$ in X such that $f(x_{\max}) = M$
and $f(x_{\min}) = m$.


Proof: It follows from Theorem 3.17.15 that f(X) is a compact set in R,
and Theorem 3.17.14 implies that f(X) is closed and bounded. Therefore M and
m are finite. Since f(X) is closed, it is clear that M and m are in f(X). Hence there
are points $x_{\max}$ and $x_{\min}$ in X such that $f(x_{\max}) = M$ and $f(x_{\min}) = m$. ∎


EXAMPLE 2. (ARZELA-ASCOLI'S THEOREM.) Let $(X,d_1)$ be a compact metric
space and let $(Y,d_2)$ be a complete metric space. Form the space C = C(X,Y)
of continuous functions defined on X with range in Y. If $f, g \in C$, define a metric
$\rho$ by

$$\rho(f,g) = \sup\{d_2(f(x), g(x)) : x \in X\}.$$

It follows from Theorems 3.17.17 and 3.17.21, and the fact that

$$d_2(f(\cdot), g(\cdot)) : X \to R$$

is continuous, that $\rho(f,g)$ is finite for all f and g.

A typical example occurs when X = [a,b] is an interval, Y = R, the reals, and
$d_1$ and $d_2$ are the usual metric. In this case $(C,\rho)$ becomes the space C[a,b] with
the sup-metric.
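
For readers who want to experiment with the sup-metric on C[a,b], the following is a minimal sketch, not part of the original text. The hypothetical function `sup_metric` replaces the supremum over [a,b] by a maximum over a uniform grid, so it only gives a lower bound for ρ(f,g); for smooth functions and a fine grid the two values are close.

```python
import math

def sup_metric(f, g, a, b, samples=10001):
    """Approximate rho(f, g) = sup |f(x) - g(x)| on [a, b] by a maximum over a grid."""
    step = (b - a) / (samples - 1)
    return max(abs(f(a + i * step) - g(a + i * step)) for i in range(samples))

def zero(x):
    return 0.0

# Distance from sin to the zero function on [0, pi]: the supremum is attained at pi/2.
print(sup_metric(math.sin, zero, 0.0, math.pi))

# The functions f_n(x) = x / n on [0, 1] satisfy rho(f_n, 0) = 1 / n.
print([sup_metric(lambda x, n=n: x / n, zero, 0.0, 1.0) for n in (1, 2, 4)])
```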

Let us now show that $(C,\rho)$ is a complete metric space. This is done as follows:
Let $\{f_n\}$ be a Cauchy sequence in $(C,\rho)$. Since $d_2(f_n(x), f_m(x)) \le \rho(f_n, f_m)$ for
each x in X, we see that $\{f_n(x)\}$ is a Cauchy sequence in $(Y,d_2)$ for each x in X.
Since $(Y,d_2)$ is complete this means that the function f defined by

$$f(x) = \lim f_n(x)$$

maps X into Y. Now let us show that f is continuous. Let $\varepsilon > 0$ be given and
choose an integer N so that $\rho(f_n, f_m) \le \varepsilon$ whenever $n, m \ge N$. Let $x \in X$. Since
$f_N$ is continuous we can find a $\delta > 0$ such that $d_2(f_N(x), f_N(x')) \le \varepsilon$ whenever
$d_1(x,x') < \delta$. Now choose $n \ge N$ and $x'$ with $d_1(x,x') < \delta$. One then has
$$\begin{aligned}
d_2(f(x), f(x')) &\le d_2(f(x), f_n(x)) + d_2(f_n(x), f_N(x)) + d_2(f_N(x), f_N(x')) \\
&\quad + d_2(f_N(x'), f_n(x')) + d_2(f_n(x'), f(x')) \\
&\le d_2(f(x), f_n(x)) + 3\varepsilon + d_2(f_n(x'), f(x')).
\end{aligned}$$




By taking the limit as $n \to \infty$ one gets

$$d_2(f(x), f(x')) \le 3\varepsilon$$

whenever $d_1(x,x') < \delta$. Hence f is continuous.

Finally we must show that $\rho(f_n, f) \to 0$ as $n \to \infty$. However, since $\{f_n\}$ is a
Cauchy sequence, for every $\varepsilon > 0$ there is an N such that

$$d_2(f_n(x), f_m(x)) \le \rho(f_n, f_m) \le \varepsilon, \quad \text{for all } x \in X$$

and all $n, m \ge N$. If we let $m \to \infty$, then the above statement becomes

$$d_2(f_n(x), f(x)) \le \varepsilon, \quad \text{for all } x \in X$$

and all $n \ge N$; that is, $\rho(f_n, f) \le \varepsilon$ whenever $n \ge N$. Hence $\rho(f_n, f) \to 0$ as $n \to \infty$.

Let A be a collection of functions from C. We seek conditions on A that will
ensure that the closure $\bar{A}$ is a compact set in $(C,\rho)$. For this we need two
definitions:


3.17.22 DEFINITION. A family of functions A in C is said to be pointwise
compact if for each $x \in X$ the set $\{f(x) : f \in A\}$ has compact closure in $(Y,d_2)$.


3.17.23 DEFINITION. A family of functions A in C is said to be equi-continuous
if for each $x \in X$ and $\varepsilon > 0$ there is a $\delta > 0$ such that $d_2(f(x), f(x')) \le \varepsilon$ for every
f in A, whenever $d_1(x,x') < \delta$. [Note: $\delta$ depends on $\varepsilon$ and x but not on f. Hence the
term “equi-continuity.”] If $\delta$ can be chosen independent of x as well, the family
is said to be uniformly equi-continuous.
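
The role of "δ independent of f" can be seen in a small numerical experiment. This is a sketch under stated assumptions, not part of the original text: X = [-1,1] and Y = R with the usual metrics, the base point is x = 0, and the two families are truncated to n = 1, ..., 1000. The hypothetical function `oscillation` returns the largest value of |f(x) - f(x')| over the family and over sampled points x' with |x - x'| < δ.

```python
import math

def oscillation(family, x, delta, a, b, samples=101):
    """Largest |f(x) - f(x')| over f in family and sampled x' in [a, b] with |x - x'| < delta."""
    offsets = [delta * (2 * j / (samples - 1) - 1) for j in range(1, samples - 1)]
    neighbours = [x + t for t in offsets if a <= x + t <= b]
    return max(abs(f(x) - f(xp)) for f in family for xp in neighbours)

# f_n(t) = sin(n t): the slopes grow with n, so no single delta serves all n.
steep = [lambda t, n=n: math.sin(n * t) for n in range(1, 1001)]
# g_n(t) = sin(t + n): every member is Lipschitz with constant 1, so delta = eps works.
shifted = [lambda t, n=n: math.sin(t + n) for n in range(1, 1001)]

print(oscillation(steep, 0.0, 0.01, -1.0, 1.0))    # close to 1 even for small delta
print(oscillation(shifted, 0.0, 0.01, -1.0, 1.0))  # at most delta
```

For the full family {sin(nt) : n = 1, 2, ...} the first quantity stays near 1 however small δ is taken, so that family is not equi-continuous at 0; the translates sin(t + n) are uniformly equi-continuous.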


3.17.24 THEOREM. (ARZELA-ASCOLI.) Let A be a set in $(C,\rho)$. Then the
following statements are equivalent:


(a) The closure $\bar{A}$ is compact.
(b) The family A is pointwise compact and equi-continuous.


Proof: (a) ⇒ (b). In order to show that A is pointwise compact we fix x
and choose a sequence $\{y_n\}$ in $\{f(x) : f \in A\}$. This means that $y_n = f_n(x)$ for some
$f_n \in A \subset \bar{A}$. Since $\bar{A}$ is compact, this means that we can find a subsequence $\{f_{n'}\}$
that converges in $(C,\rho)$, say $f_{n'} \to f$ as $n' \to \infty$. If we let y = f(x), then we see
that $y_{n'} \to y$ as $n' \to \infty$. Hence A is pointwise compact. (The same argument shows
that $\bar{A}$ is pointwise compact.)

Next we shall show that $\bar{A}$ is equi-continuous. Since $\bar{A}$ is compact, it is totally
bounded, by Theorem 3.17.8. Let $\{f_1, \ldots, f_N\}$ be an $\varepsilon$-net for $\bar{A}$. If f is any point
in $\bar{A}$ then there is an $f_i$ in the $\varepsilon$-net such that

$$d_2(f(x), f_i(x)) \le \rho(f, f_i) \le \varepsilon.$$

It follows that
$$d_2(f(x_0), f(x')) \le 2\varepsilon + d_2(f_i(x_0), f_i(x')). \qquad (3.17.1)$$




(Why?) Since each of the functions $\{f_1, \ldots, f_N\}$ is continuous at $x_0$, this means
that we can find a $\delta = \delta(x_0,\varepsilon) > 0$ such that

$$d_2(f_i(x_0), f_i(x')) \le \varepsilon \quad (1 \le i \le N), \qquad (3.17.2)$$

whenever $d_1(x_0,x') < \delta$. By combining (3.17.1) and (3.17.2), we see that $\bar{A}$ is
equi-continuous. Hence A is equi-continuous.

(b) ⇒ (a). Since X is compact, we know that it is separable. (Theorem 3.17.8
and Lemma 3.16.4.) Let $D = \{x_i\}$ be a countable dense set in X. We shall use
Lemma 3.17.4 to show that $\bar{A}$ is compact. Let $\{f_n\}$ be a sequence in A. Since
$\{f_n(x_1)\}$ lies in a compact set in Y we can find a convergent subsequence, which we
shall denote by $\{f_n^{(1)}(x_1)\}$. Since $\{f_n^{(1)}(x_2)\}$ lies in a compact set in Y, we can find a
convergent subsequence, which we shall denote by $\{f_n^{(2)}(x_2)\}$. Continuing with $x_3$, $x_4$,
and so on, we construct a family

$$\{f_n^{(1)}\}, \{f_n^{(2)}\}, \ldots, \{f_n^{(k)}\}, \ldots$$

of sequences, each a subsequence of the preceding, and with the property that
$\{f_n^{(k)}(x_i)\}$ converges for $1 \le i \le k$. It then follows that the diagonal sequence

$$\{f_n^{(n)}\} = \{f_1^{(1)}, f_2^{(2)}, f_3^{(3)}, \ldots\}$$

has the property that $\{f_n^{(n)}(x_i)\}$ converges for all $x_i$ in D. Let $f(x_i) = \lim f_n^{(n)}(x_i)$,
for $x_i \in D$.

To complete the proof we shall show three things: (i) $\{f_n^{(n)}(x)\}$ converges for
each x in X. (We shall let $f(x) = \lim_{n\to\infty} f_n^{(n)}(x)$.) (ii) The limit function f is
continuous. (iii) $\rho(f_n^{(n)}, f) \to 0$ as $n \to \infty$.

In order to show that $\{f_n^{(n)}(x)\}$ converges for each x in X, we shall show that
this sequence is a Cauchy sequence in Y. Let $\varepsilon > 0$ be given and choose $\delta > 0$ so
that $d_2(f_n^{(n)}(x), f_n^{(n)}(x')) \le \varepsilon$, for all n, whenever $d_1(x,x') < \delta$. Now choose $x_i$ in
D so that $d_1(x,x_i) < \delta$. Since $\{f_n^{(n)}(x_i)\}$ is a convergent sequence it is a Cauchy
sequence. Hence, there is an N such that $d_2(f_n^{(n)}(x_i), f_m^{(m)}(x_i)) \le \varepsilon$ whenever $n, m \ge N$.
It follows that

$$\begin{aligned}
d_2(f_n^{(n)}(x), f_m^{(m)}(x)) &\le d_2(f_n^{(n)}(x), f_n^{(n)}(x_i)) + d_2(f_n^{(n)}(x_i), f_m^{(m)}(x_i)) \\
&\quad + d_2(f_m^{(m)}(x_i), f_m^{(m)}(x)) \le 3\varepsilon
\end{aligned}$$
whenever $n, m \ge N$. Hence $\{f_n^{(n)}(x)\}$ is a Cauchy sequence. Since Y is complete
we see that $f(x) = \lim_{n\to\infty} f_n^{(n)}(x)$ is defined for all x in X.

The continuity of f follows from the equi-continuity, since

$$d_2(f(x), f(x')) = \lim d_2(f_n^{(n)}(x), f_n^{(n)}(x')). \qquad (3.17.3)$$

That is, if $\varepsilon > 0$ is given, we choose $\delta > 0$ so that by (3.17.3) and the equi-continuity
of A we have $d_2(f(x), f(x')) \le \varepsilon$, whenever $d_1(x,x') < \delta$.

The main step in showing that $\rho(f_n^{(n)}, f) \to 0$ is to note that if $\{x_n\}$ is any
sequence in X with $\lim x_n = x_0$, then $\lim f_n^{(n)}(x_n) = f(x_0)$. Indeed one does have

$$\begin{aligned}
d_2(f_n^{(n)}(x_n), f(x_0)) &\le d_2(f_n^{(n)}(x_n), f_n^{(n)}(x_0)) + d_2(f_n^{(n)}(x_0), f(x_0)) \\
&\le \varepsilon + d_2(f_n^{(n)}(x_0), f(x_0))
\end{aligned}$$

provided $d_1(x_n,x_0) < \delta$, where $\delta$ is given by the equi-continuity of A. By taking
the limit as $n \to \infty$ we get

$$\limsup_{n\to\infty} d_2(f_n^{(n)}(x_n), f(x_0)) \le \varepsilon.$$

Since $\varepsilon$ is arbitrary we have $\lim f_n^{(n)}(x_n) = f(x_0)$. The final step, namely, $\rho(f_n^{(n)}, f) \to 0$,
is now an easy exercise, see Exercise 15. ∎


EXAMPLE 3. Let u(t) be the downward thrust of a rocket of mass m. Suppose
that x(t) denotes the altitude of the rocket. Assume that $u(t) \ge 0$ and that the
maximum thrust available is M > 0, that is, $u(t) \le M$. Let the time interval over
which the rocket burns be [0,T], and assume that u(t) is an element of C[0,T].


[Figure 3.17.1: rocket of mass m at altitude x(t) with thrust u(t).]


Further, assume that changes in thrust are limited as follows:
$$|u(t) - u(s)| \le V|t - s|, \quad \text{for all } t, s \in [0,T],$$

where V > 0. Suppose the mathematical model for this system is given by

$$m \frac{d^2 x}{dt^2} = u - mg,$$

where


(i) g is the gravitational constant.
(ii) mg < M.
(iii) u(0) = mg.
(iv) x(0) = 0 and $\frac{dx}{dt}(0) = 0$.
(v) $\frac{dx}{dt}(T) = 0$.

One optimum control problem would be to select an input u so that x(T) is
maximized. Many techniques from optimum control theory are so-called indirect
methods which start by assuming that an optimum solution exists. For example, we
might start by assuming u* to be the optimum input and then build on this
assumption. It is important, then, to be able to show that u* does indeed exist. By using
compactness, one can show that an optimum input u* exists. Use Theorems
3.17.21 and 3.17.24. (Also compare with Example 6, Section 16.) ∎


EXAMPLE 4. Let $\Omega$ be a bounded open set in $R^n$ and consider the space
$L_p(\Omega)$, $1 \le p < \infty$, with the metric given by

$$d_p(f,g) = \left\{ \int_\Omega |f(x) - g(x)|^p \, dx \right\}^{1/p}.$$

Let A = {f} denote a collection of functions from $L_p(\Omega)$ and assume that each
function $f \in A$ is continuous on $\bar{\Omega}$. We claim that if A is pointwise compact (on
$\bar{\Omega}$) and equi-continuous (on $\bar{\Omega}$), then the closure $\bar{A}$ is compact in $L_p(\Omega)$. The proof
of this assertion is not difficult. It follows easily from the Arzela-Ascoli Theorem.
Indeed, if $\{f_n\}$ is any sequence in A, then the Arzela-Ascoli Theorem assures us that
there is a subsequence that converges uniformly on $\bar{\Omega}$, say $f = \lim f_{n'}$. It
remains only to show that this subsequence $\{f_{n'}\}$ converges to f in the metric $d_p$
on $L_p(\Omega)$. However,

$$d_p(f, f_{n'})^p = \int_\Omega |f(x) - f_{n'}(x)|^p \, dx \le \sup\{|f(x) - f_{n'}(x)|^p : x \in \bar{\Omega}\} \cdot |\Omega|,$$

where $|\Omega|$ denotes the Lebesgue measure of $\Omega$, which is finite. One then has $f_{n'} \to f$
in $(L_p,d_p)$. ∎


EXAMPLE 5. It is possible to characterize conditionally compact subsets of
the space $L_p(-\infty,\infty)$ with the usual metric $d_p$ when $1 \le p < \infty$. Specifically a
set A in $L_p(-\infty,\infty)$ has compact closure if and only if


(i) A is bounded, that is, there is a real number B such that $\int_{-\infty}^{\infty} |x(t)|^p \, dt \le B$
for all $x(\cdot)$ in A,
(ii) $\lim_{T\to\infty} \left\{ \int_{-\infty}^{-T} |x(t)|^p \, dt + \int_{T}^{\infty} |x(t)|^p \, dt \right\} = 0$ uniformly over A, and
(iii) $\lim_{\tau\to 0} \int_{-\infty}^{\infty} |x(t + \tau) - x(t)|^p \, dt = 0$ uniformly over A.


We will not prove this assertion here but instead we refer the reader to Dunford
and Schwartz [1, pp. 298-301]. ∎


EXERCISES 


1. Show that in the metric space $(l_p,d_p)$ a set A has compact closure if and only
if the following two conditions are satisfied:
(a) A is bounded.
(b) $\lim_{N\to\infty} \sum_{n=N}^{\infty} |x_n|^p = 0$ uniformly over A.
[Hint: See Theorem 3.16.5.]


2. What happens to Example 3 if we remove the restriction

$$|u(t) - u(s)| \le V|t - s|?$$

3. Let $d_\infty$ be the sup-metric on C[0,T]. Suppose that a set of output signals from
a system (see Figure 3.17.2) are modeled by a set $A \subset C[0,T]$, where A is
defined as the collection of all x in C[0,T] such that

$$|x(t)| \le 1, \quad \text{for all } t \in [0,T],$$
$$|x(t_1) - x(t_2)| \le V|t_1 - t_2|, \quad \text{for } t_1, t_2 \in [0,T],$$

with V > 0. Assume that the output s of the system is not observed directly
and that y, the output signal corrupted by additive noise $n \in C[0,T]$, is observed.
Further assume that

$$|n(t)| \le \frac{1}{10} \quad \text{for all } t \in [0,T].$$

[Figure 3.17.2: the noise n is added to the output s to give the observation y = s + n.]

A set C of outputs, $C \subset A$, will be said to be distinguishable if $d_\infty(c_1,c_2) > 2/10$
for all $c_1, c_2 \in C$ such that $c_1 \ne c_2$. That is, the closed balls $\{B_{1/10}[c] : c \in C\}$
are pairwise disjoint. Using compactness, show that C can contain at most a
finite number of output signals.

4. Discuss the concept of sequential compactness in a pseudometric space.
Is the analog of Theorem 3.17.8 valid in this setting? What about Theorem
3.17.13?

5. Let A be a finite set in a metric space (X,d). Show that A is compact.

6. Let $\rho(x,y) = 0$ be the trivial pseudometric on X. Characterize the (Heine-Borel)
compact subsets of X.

7. Let d(x,y) = 1 $(x \ne y)$, d(x,x) = 0 be a metric on X. Characterize the compact
subsets of X.

8. Choose sequences $\{a_n\}$ and $\{b_n\}$ in R so that $a_m < a_n < b_n < b_m$ whenever
m < n. What is the set $\bigcap_{n=1}^{\infty} I_n$, where $I_n = [a_n,b_n]$?

9. Let $f: X \to Y$ be a one-to-one mapping of X onto Y. Assume that (X,d) and
$(Y,\sigma)$ are metric spaces and that (X,d) is compact. Assume that f is continuous.
Show that $f^{-1}$ is continuous. What happens if (X,d) is not compact?


10. A metric space (X,d) is said to be locally compact if every point x in X has
a compact local neighborhood. Which of the following spaces are locally
compact?

(a) $R^n$ with $d(x,y) = \sum_{i=1}^{n} |x_i - y_i|$.

(b) R with usual metric. 

(c) (4,d)). 

(d) $(l_p,d_p)$, $1 \le p \le \infty$.

(e) (X,d) where d(x,y) = 1 $(x \ne y)$ and d(x,x) = 0.


11. (Continuation of Exercise 7, Section 5.) Let $B^\alpha$ be given by
$$B^\alpha = \{f \in C_\alpha(I) : d_\alpha(0,f) \le 1\}.$$

Show that $B^\alpha$ is a compact set in $C_\beta(I)$ where $0 < \beta < \alpha \le 1$. [Hint: Apply the
Arzela-Ascoli Theorem to the functions


f(t) — f(s)| 
t,s) =, 
where fe B*.] 
12. (Continuation of Exercise 7, Section 15.) Consider the family of Volterra
integral equations

$$x(t) = h_n(t) + \int_0^t k_n(t,s) f_n(x(s),s) \, ds, \quad n = 1, 2, \ldots, \qquad (3.17.4)$$

for $0 \le t, s \le 1$. Assume that there are positive constants H, M, A, B, and $K_n$
such that

$$|h_n(t)| \le H, \quad |k_n(t,s)| \le M, \quad |f_n(x,s)| \le A + B|x|$$

for all $0 \le t, s \le 1$, and all x, and

$$|f_n(x,s) - f_n(y,s)| \le K_n |x - y|$$

for all $0 \le s \le 1$, and all x, y. Assume further that $h_n$, $k_n$, and $f_n$ are
continuous.

(a) Show that there are constants $\alpha > 0$ and D > 0 (independent of n) such
that the (unique) solution $x_n(t)$ of (3.17.4) is defined for $0 \le t \le \alpha$ and
satisfies $|x_n(t)| \le D$ for $0 \le t \le \alpha$.

(b) Show that the sequence $\{x_n\}$ contains a uniformly convergent subsequence
on $0 \le t \le \alpha$. [Hint: Apply the Arzela-Ascoli Theorem.]

(c) Assume that $h_n \to h$, $k_n \to k$, and $f_n \to f$ where the convergence is uniform.
Assume also that $x_n \to x$ uniformly for $0 \le t \le \alpha$. Show that $f_n(x_n(t),t) \to
f(x(t),t)$ uniformly for $0 \le t \le \alpha$. Show that x(t) satisfies

$$x(t) = h(t) + \int_0^t k(t,s) f(x(s),s) \, ds,$$

for $0 \le t \le \alpha$. [Note: The limit function f need not be Lipschitz
continuous.]


13. Show that a continuous function defined on a compact metric space is
uniformly continuous.

14. Use the result of Exercise 13 to show that “equi-continuity” and “uniform
equi-continuity” are equivalent on compact metric spaces.

15. In Example 2, show that a sequence $\{f_n\}$ in $(C,\rho)$ satisfies $\rho(f_n,f) \to 0$ as
$n \to \infty$ if and only if $\lim f_n(x_n) = f(\lim x_n)$ for every convergent sequence
$\{x_n\}$ in X.

16. In Example 2, show that a family A in $(C,\rho)$ is equi-continuous if and only
if its closure $\bar{A}$ is equi-continuous. Do the same for pointwise compactness.


17. Let $h \in C[0,1]$ and define $g_n(t) = \int_0^1 \cos^2(t + nu) \, h(u) \, du$. Show that $\{g_n\}$ has a
subsequence that converges uniformly for $0 \le t \le 1$.


18. Let (X,d) be a complete metric space and $A \subset X$. Show that the closure $\bar{A}$ is
compact if and only if A is totally bounded.

19. Let $A \subset X$ where (X,d) is a metric space. Show that the closure $\bar{A}$ is compact
if and only if every sequence in A has a convergent subsequence. [Note: We
do not ask that the limit of the convergent subsequence be in A.]

20. Complete the argument in Example 3.

21. Prove Theorem 3.17.20.

22. Show that sequential compactness is equivalent to the Bolzano-Weierstrass
property.


23. The following steps will lead to a proof of the assertion that sequential
compactness is equivalent to (Heine-Borel) compactness:

(a) Show that if (X,d) is compact, then it is sequentially compact. [Hint:
Let $\{x_n\}$ be a sequence in X. Show that for every $\varepsilon > 0$ there is an open
ball B of radius $\varepsilon$ that contains an infinite number of the $\{x_n\}$.]

(b) Show that if (X,d) is sequentially compact, then it is separable. (Use
Theorem 3.17.7.)

(c) Show that if (X,d) is separable, then every open covering of (X,d) contains
a countable subcovering.

(d) Show that if (X,d) is sequentially compact, then it is compact. [Hint: Use
Step (c) and argue by contradiction.]


24. (Extension of Arzela-Ascoli Theorem.) Let $(X,d_1)$ be a compact metric space
and let $(Y,d_2)$ be a complete metric space. Let $A = \{f_1, f_2, \ldots\}$ be a sequence
of continuous functions with the following properties:

(a) A is pointwise compact.

(b) There is a sequence of positive numbers $\{\varepsilon_n\}$ such that $\varepsilon_n \to 0$ as $n \to \infty$.

(c) For each $x \in X$ and every $\varepsilon > 0$ there is a $\delta > 0$ such that

$$d_2(f_n(x), f_n(x')) \le \varepsilon + \varepsilon_n, \quad \text{for all } n,$$

whenever $d_1(x,x') < \delta$.

Show that A has compact closure in $(C,\rho)$. Show that A is equi-continuous.


25. Use Exercise 24 to solve the following problem: Let $\{g_n\}$ be a sequence of
integrable (possibly unbounded) functions defined for $0 \le t \le 1$. Assume that
there is a bounded function g with the property that $\int_0^1 |g_n(t) - g(t)| \, dt = \varepsilon_n \to 0$
as $n \to \infty$. Let

$$y_n(t) = \int_0^t g_n(s) \, ds, \quad 0 \le t \le 1.$$

(a) Show that $\{y_n\}$ has a subsequence that converges uniformly for $0 \le t \le 1$.
(b) What is the limit of this subsequence?

(c) Is it true that $\lim y_n$ exists?

26. A sequence $\{f_n\}$ of real-valued continuous functions defined for $-\infty < t < \infty$
is said to converge uniformly on compact sets to a limit f if for every compact
set $K \subset R$ one has

$$\sup_{t \in K} |f_n(t) - f(t)| \to 0$$

as $n \to \infty$. Show that $\{f_n\}$ converges to f in this sense if and only if $d(f_n,f) \to 0$
where d is the metric

$$d(f,g) = \sum_{n=1}^{\infty} 2^{-n} \min\Bigl(1, \max_{|t| \le n} |f(t) - g(t)|\Bigr).$$


27. Show that the Arzela-Ascoli Theorem extends to the space $(C(-\infty,\infty),d)$
where the metric d is defined in Exercise 26. [Note: This exercise is not a direct
application of Theorem 3.17.24 because the domain $(-\infty,\infty)$ is not compact.]

28. (Continuation of Exercise 27.) Let $f \in C(-\infty,\infty)$ and let A denote the
collection of all translates $f_\tau$ of f where $f_\tau(t) = f(t + \tau)$. Show that A has compact
closure if and only if f is bounded and uniformly continuous.


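29. Let (X,d) be a complete metric space, and let $\varepsilon_0 > 0$. For $0 \le \varepsilon \le \varepsilon_0$ let $A_\varepsilon$
denote a subset of X. Assume that for $0 < \varepsilon \le \varepsilon_0$, $A_\varepsilon$ is totally bounded in X.
Also assume that for $0 < \varepsilon \le \varepsilon_0$, there is a mapping $\Phi_\varepsilon : A_0 \to A_\varepsilon$ such that
$d(x, \Phi_\varepsilon x) \le \varepsilon$ for all $x \in A_0$. Show that $\bar{A}_0$ is compact. [Note: It is not necessary
to assume that $\Phi_\varepsilon$ is continuous.]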


30. Let (X,d) be the metric space Q made up of all the rational numbers with the
usual absolute value metric. Show that a subset A of (X,d) is compact if and
only if it is bounded and closed when considered as a subset of the real numbers.
Hence the set A = {0,1/2,1/3,1/4,...,1/k,...} is compact in Q. On the other
hand, the subset B = {3,3.1,3.14,3.141,...} is not compact in Q.

31. Use the results of this section to reconsider

(a) Exercise 8, Section 16.

(b) Exercise 9, Section 16.

(c) Exercise 10, Section 16.

(d) Exercise 11, Section 16.

32. Let $\{B_n\}$ be a sequence of nonempty compact sets in a metric space (X,d).
Assume that $\{B_n\}$ is decreasing, that is, $B_{n+1} \subset B_n$. (a) Show that $\bigcap_{n=1}^{\infty} B_n$ is
nonempty and compact. (b) What happens if the sets $B_n$ are not compact?




SUGGESTED REFERENCES 


Bartle [1].
Boas [1].
Hardy, Littlewood, and Polya [1].
Hewitt [1].
Kelley [1].
Kolmogorov and Fomin [1].
Lee and Markus [1].
Maak [1].
Royden [1].
Rudin [1].
Simmons [1].


Algebraic Structure


1. Introduction

Part A  Introduction to Linear Spaces

2. Linear Spaces and Linear Subspaces
3. Linear Transformations
4. Inverse Transformations
5. Isomorphisms
6. Linear Independence and Dependence
7. Hamel Bases and Dimension
8. The Use of Matrices to Represent Linear Transformations
9. Equivalent Linear Transformations

Part B  Further Topics

10. Direct Sums and Sums
11. Projections
12. Linear Functionals and the Algebraic Conjugate of a Linear Space
13. Transpose of a Linear Transformation


1. INTRODUCTION


The study of algebraic structures is certainly one of the oldest endeavors in 
the world of mathematics. For example, with our current viewpoint many of the 
ancient problems of mathematics, such as the ruler-compass constructions of 
Euclidean geometry, are seen to be algebraic problems in disguise. However, it 
was not until rather recently that mathematicians began to observe the important 
unifying role of the concept of linear spaces. It is the algebraic structure of these 
spaces that interests us in this chapter. 

In this chapter we leave aside topological considerations. Here, words and 
phrases such as limit, Cauchy sequence, open set, closed set, completeness, com- 
pact, closure, dense, isometry, homeomorphism, and, above all, continuity and 
convergence are suppressed. Recall that in Chapter 1 we divided the structure 
of the real Euclidean plane into three categories: set-theoretic, topological, and 
algebraic structure. In Chapter 2 we reviewed set-theoretic structure. In Chapter 
3 we studied spaces with topological structure only, that is, metric and pseudo- 
metric spaces. Now we turn our attention to mathematical systems that have 
algebraic structure only, namely, linear spaces or, as they are sometimes called,
“vector spaces.”

We should note that there are other algebraic structures such as groups, rings, 
Boolean algebras, and so on. We will not discuss these structures here since they 
are not germane to our primary objective. 


Part A

Introduction to Linear Spaces


2. LINEAR SPACES AND LINEAR SUBSPACES 


A linear space consists of a set (the underlying set), a scalar field, and some 
structure. For present purposes the scalar field is always either the real numbers 
R or the complex numbers C. In cases where we do not wish to distinguish between 
R and C, we simply refer to the scalar field F. The structure of a linear space is 
based on operations of addition and scalar multiplication. 


4.2.1 DEFINITION. A linear space over a scalar field F is a nonempty set
X and


(1) A mapping of $X \times X$ into X, called addition and written $x_1 + x_2$.
(2) A mapping of $F \times X$ into X, called scalar multiplication and written $\alpha x$.


Addition and scalar multiplication must satisfy the following conditions:


(A1) $x_1 + x_2 = x_2 + x_1$ for all $x_1, x_2$ in X.

(A2) $x_1 + (x_2 + x_3) = (x_1 + x_2) + x_3$ for all $x_1, x_2, x_3$ in X.

(A3) There exists a (unique) element in X denoted by 0 and called the
origin, such that 0 + x = x for every x in X.

(A4) Associated with each x in X is a (unique) point $-x$ in X such that
$x + (-x) = 0$.

(SM1) $\alpha(\beta x) = (\alpha\beta)x$ for all $\alpha, \beta \in F$ and all $x \in X$.

(SM2) $1x = x$ for all $x \in X$.

(SM3) $0x = 0$ for all $x \in X$.

(A & SM1) $\alpha(x_1 + x_2) = \alpha x_1 + \alpha x_2$ for all $\alpha \in F$ and all $x_1, x_2 \in X$.

(A & SM2) $(\alpha + \beta)x = \alpha x + \beta x$ for all $\alpha, \beta \in F$ and all $x \in X$.


The somewhat strange numbering of the above conditions is employed to call 
attention to the nature of the conditions: (Al) for conditions on the addition 
operation, (SM1) for conditions on the scalar multiplication operation, and 
(A & SM1) for conditions on addition and scalar multiplication jointly. 

It is not claimed that the above conditions are logically independent. For
example, let $\alpha = 1$ and $\beta = 0$ in (A & SM2). Since 1 + 0 = 1 in F, it follows that
$1x = 1x + 0x$. From (SM2) $1x = x$; therefore, $x = x + 0x$ for all x. But from (A3),
the origin is the only point y such that x + y = x for all x. Therefore, $0x = 0$. Thus we
have shown that (A & SM2), (SM2), and (A3) imply (SM3). We define linear
spaces in terms of the above (logically dependent) set of conditions for the simple
reason that this form of definition yields quick intuitive insight into the nature of
linear spaces.

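There are certain important conclusions which follow from the above
conditions. These are: $-x = (-1)x$; $\alpha 0 = 0$; $-0 = 0$; $y + [x + (-y)] = x$; $\alpha x = 0 \Rightarrow \alpha = 0$
or $x = 0$ (or both); $x + y = x + z \Rightarrow y = z$; $\alpha x = \alpha y$ for $\alpha \ne 0 \Rightarrow x = y$; and $\alpha x = \beta x$
for $x \ne 0 \Rightarrow \alpha = \beta$.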

Sometimes we refer to a linear space over the scalar field R as a real linear 
space, and a linear space over C as a complex linear space. Again, where no distinc- 
tion is made we refer to the scalar field F. We shall consistently denote a linear 
space by the symbol denoting its underlying set, for example, X. One could write 
$(X, F, +, \cdot)$ to denote a linear space, but this would be far too cumbersome.

A subset Y of a linear space X is said to be a linear subspace if $x_1 + x_2 \in Y$
whenever $x_1, x_2 \in Y$ and $\alpha x \in Y$ whenever $\alpha \in F$ and $x \in Y$. The reader should
verify that a linear subspace is itself a linear space, that is, the nine properties
defining a linear space are satisfied. One should not confuse the concept of a
linear subspace with the concept of a subspace of a metric space.
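
The two closure conditions in the definition of a linear subspace can be tested mechanically on finitely many samples. The following is a sketch under stated assumptions, not part of the original text: the sets are the lines {(t, 2t + c)} in $R^2$ with integer samples, and the names `closed_under_operations` and `line` are hypothetical. A failed check exhibits a genuine counterexample; a passed check is only evidence, since a proof must cover all elements and all scalars.

```python
from itertools import product

def closed_under_operations(member, samples, scalars):
    """Test closure under addition and scalar multiplication on finitely many samples."""
    for x, y in product(samples, repeat=2):
        if not member((x[0] + y[0], x[1] + y[1])):
            return False  # a sum of two members left the set
    for alpha, x in product(scalars, samples):
        if not member((alpha * x[0], alpha * x[1])):
            return False  # a scalar multiple of a member left the set
    return True

def line(c):
    """Membership test and integer samples for the set {(t, 2t + c)} in R^2."""
    member = lambda v: v[1] == 2 * v[0] + c
    samples = [(t, 2 * t + c) for t in range(-3, 4)]
    return member, samples

through_origin = line(0)  # passes: a line through the origin
shifted_line = line(1)    # fails: for instance (0,1) + (1,3) = (1,4) is not on the line
print(closed_under_operations(*through_origin, scalars=[-2, 0, 3]))
print(closed_under_operations(*shifted_line, scalars=[-2, 0, 3]))
```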

Next let us consider some examples. Not too surprisingly many of the examples 
of metric spaces given in Chapter 3 can also be used as examples of linear spaces. 
Of course, one has to shift his attention. In Chapter 3 topological structure was 
in the limelight and all other structure that happened to be present in an example 
was carefully ignored. Now this chapter brings linear space structure into the 
forefront. 

We leave it to the reader to show that, considering the obvious definitions of
addition and scalar multiplication, the following are examples of linear spaces:
R, C, $R^n$, $C^n$, $l_p$ with $1 \le p \le \infty$, and C[0,T]. Further, we invite the reader to find
other linear spaces among the examples and exercises of Chapter 3.


EXAMPLE 1. The plane: Let $V^2$ be the set of all vectors in a plane P emanating
from a point 0 in this plane. Three such vectors are illustrated in Figure 4.2.1.
Defining addition, scalar multiplication, the origin, and negation in the usual way
we have a linear space. Of course, the critical reader may wonder if we have really
defined anything. We have not really said what a plane or a vector is. Nevertheless,
this example does present an intuitive picture of a linear space. ∎

[Figure 4.2.1: vectors in the plane P.]


EXAMPLE 2. Many of the examples of linear spaces are function spaces and,
as such, vector addition and scalar multiplication are defined in an obvious fashion.
To explain, let S be any nonempty set and let X denote the collection of all
scalar-valued functions defined on S. If $x, y \in X$, then we define x + y by

$$(x + y)(s) = x(s) + y(s), \quad \text{for all } s \in S,$$

which is certainly an element of X. Also if $\alpha$ is any scalar and $x \in X$, then $\alpha x$ is
defined by

$$(\alpha x)(s) = \alpha(x(s)), \quad \text{for all } s \in S,$$

which is also in X. In other words, we define these operations pointwise. It is a
trivial matter, then, to show that X is a linear space. ∎
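
The pointwise operations of Example 2 translate directly into code. The following is a minimal sketch, not part of the original text: elements of X are represented as Python functions on a set S, and the names `add`, `scale`, and `zero` are hypothetical. The set S = {0, 1, 2, 3, 4} and the functions x and y are chosen only for illustration.

```python
def add(x, y):
    """Pointwise sum: (x + y)(s) = x(s) + y(s)."""
    return lambda s: x(s) + y(s)

def scale(alpha, x):
    """Pointwise scalar multiple: (alpha x)(s) = alpha * x(s)."""
    return lambda s: alpha * x(s)

def zero(s):
    """The origin of X: the function that is identically zero."""
    return 0

S = range(5)
x = lambda s: s * s
y = lambda s: 3 * s - 1

# Condition (A & SM1), alpha(x + y) = alpha x + alpha y, checked at every point of S.
lhs = scale(2, add(x, y))
rhs = add(scale(2, x), scale(2, y))
print(all(lhs(s) == rhs(s) for s in S))
```

Each of the nine conditions reduces in this way to the corresponding property of the scalars, evaluated one point of S at a time.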


EXAMPLE 3. Let $X = L_p[0,T]$, $1 \le p < \infty$, the set of all functions x defined
on [0,T] such that

$$\int_0^T |x(t)|^p \, dt < \infty. \qquad (4.2.1)$$

Addition, scalar multiplication, the origin, and negation are defined as in Example
2. Of course, here one does have to show that x + y and $\alpha x$ satisfy (4.2.1).

There is an important point to be made about the above linear space. If x and
y are points in $L_p$ and they differ only on a set of measure zero, then they are still
different points in the linear space; and in most situations it is not desirable to
distinguish between them. We confronted this situation before in Example 10 of
Section 3.3 when we considered $L_p[0,T]$ as a metric space. We handle it here in
exactly the same way. In particular, we use the “equality”

$$x = y \iff \int_0^T |x(t) - y(t)|^p \, dt = 0.$$

Given this equality, the linear space structure follows in a natural way. ∎


EXAMPLE 4. Let $(\Omega,\mathcal{F},P)$ be a probability space, and let $L_2(\Omega,\mathcal{F},P)$
denote the set of all complex-valued random variables X defined on $(\Omega,\mathcal{F},P)$
such that

$$E\{|X|^2\} < \infty,$$

where E denotes the expectation operation. Addition and scalar multiplication are
defined in the natural way. Indeed, this is a generalization of Example 3. We even
have a “new equality” here too, namely

$$X = Y \iff E\{|X - Y|^2\} = 0. \ ∎$$




EXERCISES 
1. Let X be the linear space R*. For what values of r, if any, is the set 
A, = {xe R*:x, +x, 4+%3+x%,=7r}, 


where r is a real number, a linear subspace of X? For what values of r, if 
any, is the set 


B,= {xe Rt: x7 + x27 +x37 + x47 =1r7} 


a linear subspace of X? 


2. Let X be the linear space made up of all complex-valued functions T(s) defined
on the imaginary axis of the complex plane such that

$$\int_{-i\infty}^{i\infty} |T(s)|^2 \, ds < \infty,$$

where the integral is along the imaginary axis. That is, X = L_2(−i∞, i∞).
Let A be the set made up of all rational functions, that is, all functions of the
form

$$T(s) = \frac{a_0 s^m + \cdots + a_m}{b_0 s^n + \cdots + b_n},$$

where a_0 ≠ 0, b_0 ≠ 0, and m and n are integers. Is the set A a linear subspace of X?
Next consider the subset B of A made up of all rational functions with n > m.
Is B a linear subspace of X? What about the subset C of B made up of all
functions with all their finite poles in the left-hand plane?


3. Show that the set of all n × m matrices can be viewed as a linear space.


4. Let X be the linear space C[0,T]. Which, if any, of the following subsets of
X are linear subspaces?


B_1 = {x ∈ C[0,T]: x(0) = x(T)},
B_2 = {x ∈ C[0,T]: x(0) = x(T) = 0},
B_3 = {x ∈ C[0,T]: x(t_1) = x(t_2) for all t_1, t_2 such that t_1 + t_2 = T},
B_4 = {x ∈ C[0,T]: x(0) = 1},
B_5 = {x ∈ C[0,T]: ∫_0^T x(t) dt = 1},
B_6 = {x ∈ C[0,T]: |x(t_1) − x(t_2)| ≤ 10|t_1 − t_2| for all t_1, t_2 ∈ [0,T]}.


5. Show that if {B_α} is a family of linear subspaces of a linear space X, then
B = ∩_α B_α is a linear subspace of X. What about ∪_α B_α?


6. Let X be the linear space made up of all real-valued sequences. Show that
A_1, the set of all sequences that have a finite number of nonzero entries only,
is a linear subspace of X. Show that A_2, the set of all sequences that have an
infinite number of nonzero entries, is not a linear subspace of X.




7. Often in systems theory the linear space L_2(−∞,∞) is a good mathematical
model for the set X of all inputs to a system as well as the set Y containing
the range. Let ℳ be the set of all mappings (linear and nonlinear) of X into
Y. Show that ℳ can be viewed as a linear space. Show that the subset 𝒞 ⊂ ℳ
of all mappings representing causal (Section 2.8) systems is a linear subspace
of ℳ.


8. Let X be the linear space made up of all absolutely convergent sequences of 
real numbers. Show that B, the set of all absolutely convergent sequences of 
real numbers with limit zero, is a linear subspace of X. 


9. Let X be the set of all convergent sequences of real numbers. Is X a linear
space?

10. Let X denote the collection of all real-valued Lipschitz-continuous functions
x(t) defined for −∞ < t < ∞. That is, x(t) satisfies |x(t) − x(s)| ≤ k|t − s|
for some constant k (which depends on x) and for all t, s. Show that X is a
real linear space.


3. LINEAR TRANSFORMATIONS 


Continuous transformations play a central role in the case of metric spaces. 
Likewise there is a special class of transformations that plays a central role in the 
case of linear spaces, namely linear transformations. 


4.3.1 DEFINITION. A transformation L of a linear space X into a linear space
Y, where X and Y have the same scalar field, is said to be a linear transformation if


(i) L(αx) = αL(x) for all x ∈ X and all scalars α, and
(ii) L(x_1 + x_2) = L(x_1) + L(x_2) for all x_1, x_2 ∈ X.


Otherwise it is said to be a nonlinear transformation.

The scalar multiplication operations on the left- and right-hand sides of (i)
above are, of course, those of X and Y, respectively. Similarly, the addition
operations in (ii) are from X and Y, respectively. Please note carefully that (i) and (ii)
must be satisfied for all x's and α's, not just some of them!

It might appear that one of the conditions (i) and (ii) implies the other;
however, this is not so. For example, if X = Y = C, the complex numbers, and L
is the operation of complex conjugation, then L(z_1 + z_2) = L(z_1) + L(z_2) for all
z_1, z_2 ∈ C, but for complex scalars α with Im α ≠ 0 and z ≠ 0 we have L(αz) ≠ αL(z).
Going the other way, let X = R^2 and Y = R. Define a mapping G: X → Y by

$$G[(x_1,x_2)] = \begin{cases} x_1 + x_2, & \text{if } x_1 x_2 > 0, \\ 0, & \text{otherwise,} \end{cases}$$

where x = (x_1,x_2) ∈ R^2. The mapping G has the property that G(αx) = αG(x) for
all x and real α. On the other hand, G(x + y) does not equal G(x) + G(y) for all
x, y ∈ R^2. For example, if x = (1,0) and y = (0,1), then G(x) = 0, G(y) = 0, and
G(x + y) = 2.
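
The two counterexamples above can be checked numerically. The following Python sketch is only an illustration: the function names conj and G and the sample points are our own choices, not part of the text.

```python
# Additivity without homogeneity: complex conjugation on C with complex scalars.
def conj(z: complex) -> complex:
    return z.conjugate()

# Homogeneity without additivity: the mapping G on R^2 defined in the text.
def G(x):
    x1, x2 = x
    return x1 + x2 if x1 * x2 > 0 else 0.0

z1, z2, alpha = 1 + 2j, 3 - 1j, 1j           # Im(alpha) != 0
additive = conj(z1 + z2) == conj(z1) + conj(z2)
homogeneous = conj(alpha * z1) == alpha * conj(z1)

x, y, a = (1.0, 0.0), (0.0, 1.0), -2.5
u = (2.0, 3.0)                               # a sample point with u1*u2 > 0
G_homogeneous = G((a * u[0], a * u[1])) == a * G(u)
G_additive = G((x[0] + y[0], x[1] + y[1])) == G(x) + G(y)

print(additive, homogeneous)       # conjugation: additive, but not homogeneous
print(G_homogeneous, G_additive)   # G: homogeneous at this point, but not additive
```

A single sample point cannot prove homogeneity of G, of course; the check only confirms that the failure occurs exactly where the text says it does.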

Note that it only makes sense to ask whether or not L: X → Y is linear if X
and Y are both linear spaces and over the same scalar field. Otherwise, it is a
meaningless question. Further, if we have a mapping S of a linear space X into a
linear space Y, be it linear or not, it does not make sense to ask whether or not S
is continuous. This question becomes meaningful only after we have added
topological structure, as we shall see in the next chapter.

Up to this point nothing has been said that would guarantee that even one
linear transformation exists. We circumvent the general existence problem by
merely exhibiting some examples of linear transformations later in this section.
It is possible to show that given any two nontrivial linear spaces X and Y over the
same scalar field, there always exists a nontrivial linear transformation L: X → Y.
The proof of this theorem is outlined in Exercise 5 at the end of this section.

The null space, N(L), of a linear transformation L: X → Y is the subset of X
defined by

$$N(L) = \{x \in X : Lx = 0\}.$$

The origin of X is, of course, always in N(L), that is, L(0) = 0. A much more
interesting fact is that N(L) is always a linear subspace of X. The range space
R(L) of a linear transformation L: X → Y is

$$R(L) = \{y = Lx : x \in X\}.$$

We note here that since L is linear, R(L) is a linear subspace of Y.
A reason for the importance of linear transformations is embodied in the
following lemma.


4.3.2 LEMMA. A transformation L of X into Y, where X and Y are linear
spaces over the same scalar field, is linear if and only if

$$L(\alpha_1 x_1 + \cdots + \alpha_n x_n) = \alpha_1 L(x_1) + \cdots + \alpha_n L(x_n)$$

for all x_1, x_2, ..., x_n ∈ X, all scalars α_1, α_2, ..., α_n, and all finite n.


The proof of this lemma follows immediately from Definition 4.3.1. 

Lemma 4.3.2 shows that knowledge of how L transforms a finite number of
points {x_1,...,x_n} allows a very simple characterization of how L transforms
every point of the form α_1x_1 + ... + α_nx_n. This fact, which is sometimes referred
to as the principle of superposition, has far-reaching ramifications.
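
As a small illustration of superposition, the Python sketch below takes X = Y = R^2 and fixes a linear map by its values on the two points e1 = (1,0) and e2 = (0,1). The particular images L_e1 and L_e2, and the helper name comb, are arbitrary assumptions made for the example.

```python
# A linear map L on R^2 is determined by its values on the points e1 and e2.
# The two images below are arbitrary illustrative choices.
L_e1 = (2.0, -1.0)   # L(e1)
L_e2 = (0.5, 3.0)    # L(e2)

def L(x):
    """Evaluate L at x = x1*e1 + x2*e2 by superposition."""
    x1, x2 = x
    return (x1 * L_e1[0] + x2 * L_e2[0], x1 * L_e1[1] + x2 * L_e2[1])

def comb(a, u, b, v):
    """Return the linear combination a*u + b*v in R^2."""
    return (a * u[0] + b * v[0], a * u[1] + b * v[1])

u, v, a, b = (1.0, 2.0), (-3.0, 4.0), 2.0, -0.5
lhs = L(comb(a, u, b, v))        # L(a*u + b*v)
rhs = comb(a, L(u), b, L(v))     # a*L(u) + b*L(v)
print(lhs, rhs)                  # the two pairs agree
```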

It should be carefully noted that nothing has been said in Lemma 4.3.2 about
expressions of the form Σ_{i=1}^∞ α_i x_i. Again, infinite series are not meaningful in the
present linear space context, devoid of topological structure. Moreover, even if
there were topological structure present, a transformation L being linear does not
imply that

$$L\Bigl(\sum_{i=1}^{\infty} \alpha_i x_i\Bigr) = \sum_{i=1}^{\infty} \alpha_i L(x_i).$$

For example, differentiating an infinite series of functions term-by-term is not
necessarily the same as differentiating the limit function. This is an important
point, for the principle of superposition or linearity is sometimes mistakenly
called upon to justify steps that just cannot be justified on the basis of linearity
alone.

Now let us consider some examples of linear transformations. 


EXAMPLE 1. Consider the spring and mass system shown in Figure 4.3.1.
With no applied force f the equilibrium or rest position is x = 0. Assume now that
there is viscous friction between the mass and the surface it slides on, and that this
viscous friction force is modeled by −b(dx/dt). The combined restoring force of
the springs is modeled by −kx. We assume that we are interested in modeling
the behavior of this system for times t ≥ 0 and that x(0) = 0 and (dx/dt)(0) = 0.

[Figure 4.3.1. Spring and mass system: mass m, applied force f, position x.]

As is well-known, f(t) and x(t) are related by the differential equation


$$f = m \frac{d^2 x}{dt^2} + b \frac{dx}{dt} + kx. \tag{4.3.1}$$


We view f as a point in C[0,∞), the set of all real-valued continuous functions
defined on [0,∞). Solving (4.3.1) with the initial conditions x(0) = 0 and x'(0) = 0
one gets

$$x(t) = \int_0^t h(t - s) f(s) \, ds, \tag{4.3.2}$$

where

$$h(\tau) = \frac{1}{m(\lambda_1 - \lambda_2)} \bigl[ e^{\lambda_1 \tau} - e^{\lambda_2 \tau} \bigr]$$

and λ_1 and λ_2 are roots of the equation

$$m\lambda^2 + b\lambda + k = 0.$$


¹ This problem is considered again in Chapter 5 (Theorem 5.6.2), where we have a topological as
well as a linear space structure.




(We assume here that λ_1 and λ_2 are real and different. Otherwise h(τ) has a slightly
different form.) Equation (4.3.2) then defines a mapping L: C[0,∞) → C[0,∞),
where x = Lf. (Show that L is linear.)
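
The linearity of the mapping defined by (4.3.2) can be explored numerically before it is proved. The sketch below is a rough illustration under stated assumptions: the parameter values m = 1, b = 3, k = 2 (which give the real, distinct roots −1 and −2), the trapezoidal rule, and the test inputs are all our own choices.

```python
import math

# Assumed illustrative parameters: m = 1, b = 3, k = 2, so that
# m*lam**2 + b*lam + k = 0 has the real, distinct roots -1 and -2.
m, b, k = 1.0, 3.0, 2.0
disc = math.sqrt(b * b - 4.0 * m * k)
lam1, lam2 = (-b + disc) / (2.0 * m), (-b - disc) / (2.0 * m)

def h(tau):
    """Kernel of (4.3.2) in the case of real, distinct roots."""
    return (math.exp(lam1 * tau) - math.exp(lam2 * tau)) / (m * (lam1 - lam2))

def L(f, t, n=2000):
    """Approximate x(t) = integral_0^t h(t - s) f(s) ds by the trapezoidal rule."""
    if t == 0.0:
        return 0.0
    ds = t / n
    total = 0.5 * (h(t) * f(0.0) + h(0.0) * f(t))
    for i in range(1, n):
        s = i * ds
        total += h(t - s) * f(s)
    return total * ds

f1 = math.sin
f2 = lambda s: s * s
a1, a2, t = 2.0, -0.7, 1.5
lhs = L(lambda s: a1 * f1(s) + a2 * f2(s), t)   # L(a1*f1 + a2*f2)
rhs = a1 * L(f1, t) + a2 * L(f2, t)             # a1*L(f1) + a2*L(f2)
print(abs(lhs - rhs))                           # rounding-level difference
```

As a sanity check on the kernel, a constant unit force should drive x(t) toward the static deflection 1/k for large t, and L(lambda s: 1.0, 20.0) is indeed close to 0.5 with these parameters.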

Equation (4.3.1) really only describes the physical situation for relatively small
motions x. One might construct a more globally applicable mathematical model
by changing the mathematical model of the restoring force to include the effect
of a spring being completely compressed. For example, instead of a linear spring
−kx one might use −g(x), where g(x) would model the abrupt increase in restoring
force occurring when a spring is completely compressed. We then would obtain

$$f = m \frac{d^2 x}{dt^2} + b \frac{dx}{dt} + g(x) = H(x), \tag{4.3.3}$$

where H: C^2[0,∞) → C[0,∞) and C^2[0,∞) is the linear subspace of C[0,∞) made
up of all functions with continuous first and second derivatives.

Presumably (4.3.3) is a better mathematical model than (4.3.1) in that it more
completely describes the physical system. On the other hand, H is a nonlinear
transformation, and it is usually not possible to represent the inverse of H, if it
exists, as simply as L is represented for (4.3.1). We have, then, a typical example of
the often conflicting goals of mathematical modeling: (1) a complete description
of the physical situation versus (2) a mathematically tractable model. ∎


EXAMPLE 2. Let L_2(−i∞,i∞) be the linear space defined in Example 13 of
Section 3.3. Define a mapping L on L_2(−i∞,i∞) by

$$Y(i\omega) = H(i\omega) X(i\omega), \quad \text{for all } \omega,$$

where H(iω) is a bounded, measurable function. It immediately follows that
R(L) ⊂ L_2(−i∞,i∞). It is also easily shown that L is linear. Needless to say, this
type of operation occurs very often in Fourier analysis. ∎


EXAMPLE 3. Suppose that we are interested in the temperature of an infinite
bar as a function of time, t, and position, x. Denote the temperature by T(x,t).
Further, assume that the bar is being heated along its length by a distributed heat
source. The heat supplied per unit length at x and t is denoted θ(x,t). (See Figure
4.3.2.) Assume that at t = 0, T(x,0) = 0. Assume that θ is a point in the linear
space X made up of all bounded continuous, real-valued functions defined on
(−∞,∞) × [0,∞). It is well-known that


$$T(x,t) = \int_0^{\infty} \int_{-\infty}^{\infty} H(x - x', t - t') \, \theta(x',t') \, dx' \, dt', \tag{4.3.4}$$

where

$$H(x,t) = \begin{cases} K \dfrac{e^{-x^2/4t}}{\sqrt{t}}, & \text{for } t > 0, \\ 0, & \text{for } t \le 0, \end{cases}$$

and K is a constant. We leave it to the reader to show that (4.3.4) represents a
linear transformation of X into itself. ∎

[Figure 4.3.2. Infinite bar: heat supplied θ(x,t) and temperature T(x,t).]


EXAMPLE 4. In Section 2.8 we defined causal systems. Here we would like
to show an important fact about causality in linear systems. Let X be a linear
space made up of functions x defined on the real line. For each time T, let X_T
denote the linear subspace of X made up of all functions x such that x(t) = 0 for
t < T.² Further, let L be a linear mapping of X into itself. We claim that L is causal
if and only if each linear subspace X_T is invariant under L, that is, if and only if
L(X_T) ⊂ X_T for all T.

First suppose that L is causal, and let T be any fixed time. Then let x be any
point in X_T. We want to show that Lx ∈ X_T. Since L is causal and x(t) = x_0(t)
for t < T, where x_0 is the zero input, (Lx)(t) = (Lx_0)(t) for t < T. But L is linear so
(Lx_0)(t) = 0 for all t. Hence, (Lx)(t) = 0 for t < T and Lx ∈ X_T.

Now suppose that for each T the linear subspace X_T is invariant under L.
Let x_1 and x_2 be any two inputs such that x_1(t) = x_2(t) for t < T. Letting y_1 = Lx_1
and y_2 = Lx_2, we want to show that y_1(t) = y_2(t) for t < T. But y_1 − y_2 = L(x_1 − x_2)
and (x_1 − x_2) ∈ X_T. Since X_T is invariant under L, (y_1 − y_2) ∈ X_T, or y_1(t) = y_2(t)
for t < T.

It should be mentioned that if L is not linear, then the subspaces X_T being
invariant under L does not imply causality. The next example shows this. ∎
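
A finite, discrete-time analogue may help fix the idea. In the Python sketch below (an illustration only, with signals replaced by lists of length n and the linear mapping by an n × n matrix, all of which are our assumptions), X_T is the set of signals that vanish at all indices t < T. A lower triangular matrix leaves every X_T invariant, while a matrix whose output at time 0 uses a future input value does not.

```python
# Discrete-time analogue (an assumption made for illustration): signals are
# lists of length n, and X_T is the set of signals with x[t] == 0 for t < T.
def apply(matrix, x):
    return [sum(row[s] * x[s] for s in range(len(x))) for row in matrix]

def in_X_T(x, T):
    return all(x[t] == 0 for t in range(T))

def leaves_every_X_T_invariant(matrix):
    """Check L(X_T) subset X_T for all T on the unit pulses, which span X_T."""
    n = len(matrix)
    for T in range(n + 1):
        for s in range(T, n):
            pulse = [1 if t == s else 0 for t in range(n)]
            if not in_X_T(apply(matrix, pulse), T):
                return False
    return True

causal = [[1, 0, 0],
          [2, 1, 0],
          [0, 3, 1]]        # lower triangular: y[t] uses x[s] only for s <= t
anticipating = [[1, 4, 0],
                [0, 1, 0],
                [0, 0, 1]]  # y[0] depends on the future value x[1]

print(leaves_every_X_T_invariant(causal))        # True
print(leaves_every_X_T_invariant(anticipating))  # False
```

Testing only the unit pulses is enough here because the mapping is linear and the pulses at indices s ≥ T span X_T.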


EXAMPLE 5. (This example is a continuation of the preceding one.) Let M
denote the linear subspace of X defined by

$$M = \bigcup_T X_T,$$

that is, x ∈ M if it vanishes to the left of some finite time. Let us assume that M
is not all of X. For example, M ≠ X if X = L_2(−∞,∞). We now define a mapping
A of X into itself by defining it on M and M' = X − M. In particular, let

$$(Ax)(t) = x(t - 1) \quad \text{for } x \in M$$

and

$$(Ax)(t) = x(t + 1) \quad \text{for } x \in X - M.$$


² In spaces such as L_2(−∞,∞) we say that x(t) = 0 almost everywhere in (−∞,T).




Clearly the linear subspaces X_T are invariant under A. Moreover, if there are
just two distinct inputs x_1 and x_2 in (X − M) and a time T such that x_1(t) = x_2(t)
for t < T, then A is not causal. ∎


EXERCISES 


1. Let X and Y be linear spaces over the same scalar field. Show that I: X → X,
the identity transformation, and 0: X → Y, the zero transformation, are linear.


2. Let X and Y be linear spaces over the same scalar field. Show that Lt[X,Y], the
set of all linear transformations of X into Y, is a linear space when addition of
linear transformations and multiplication of linear transformations by scalars
are defined as in Example 2, Section 2.


3. Let X be an arbitrary nonempty set and let Y be a linear space. Show that F, the
set of all mappings of X into Y, is a linear space when addition of mappings
and scalar multiples of mappings are defined as in Example 2, Section 2.


4. Let X, Y, and Z be linear spaces over the same scalar field, and let L_1: X → Y
and L_2: Y → Z be linear. Show that the composition L_2L_1: X → Z is linear.


5. Let X and Y be linear spaces over the same scalar field, and let M be a proper
linear subspace of X. Further, let f be a linear transformation of M into Y.
Show that there exists a linear extension F of f defined on X, that is, F: X → Y
and F(x) = f(x) for each x ∈ M. [Hint: Let x_0 be a point in X but not in M, and
let M_0 be the linear subspace of X made up of all points x + αx_0, where x ∈ M
and α is a scalar. Show that there is a unique expression x + αx_0 for each point
in M_0. Define a linear transformation F_0 of M_0 into Y by

$$F_0(x + \alpha x_0) = f(x) + \alpha y_0,$$

where y_0 is an arbitrary point in Y. Then F_0 is an extension of f to M_0. Let
E be the class of all linear transformations U with domain D(U) ⊂ X, range
R(U) ⊂ Y, and which are linear extensions of f. Next introduce the following
partial ordering on E: U ≤ V if D(U) ⊂ D(V) and U(x) = V(x) for all x ∈ D(U).
Let C be an arbitrary totally ordered subset of E. Show that C has an upper
bound. Finally apply Zorn's lemma.]


6. Suppose that we consider a system whose output is a delayed version of the
input. That is, if x(t) is the input, then the output y(t) = x(t − τ), where τ is a
constant. Let X be the linear space C(−∞,∞) of continuous real-valued
functions defined on (−∞,∞). Let D denote the system operation. Is D a linear
transformation of X into itself? Suppose that instead of being constant the
delay τ is given by τ = e^{−t}. Do we have a linear transformation? Then suppose
that τ = exp[−∫_{−∞}^{t} |x(λ)| dλ], where of course the linear space X must be
selected so that the integral exists. Do we have a linear transformation?


7. Let Y = C([0,∞), R^n) be the linear space made up of all continuous mappings
of [0,∞) into R^n, that is, each component is continuous. Let X = C^1([0,∞), R^n)
be the linear subspace of Y made up of all elements of Y with continuous
derivatives, that is, each component has a continuous derivative. Does the expression
y = Tx, where


and A is a real n × n matrix, represent a linear transformation of X into Y?


8. Let Y = BC(−∞,∞) denote the space of all bounded real-valued continuous
functions y(t) defined for −∞ < t < ∞ and let X denote the space of all
Lipschitz continuous functions; see Exercise 10, Section 2. Define x = Ly by
x(t) = ∫_0^t y(s) ds. Show that L is a linear mapping of Y into X.


4. INVERSE TRANSFORMATIONS 


The concept of the inverse mapping, which is a set-theoretic concept, was
discussed in Section 2.7. Let X and Y be sets and let G: X → Y be a transformation.
Recall (Theorem 2.7.6) that G is invertible if and only if G is a one-to-one mapping
of X onto Y, that is, R(G) = Y. Recall also that G has a left inverse on Y if and
only if G is one-to-one, and that G has a right inverse on Y if and only if R(G) = Y.

In this section we ask what more can be said when X and Y are linear spaces
and the mapping G is a linear transformation. The first result is an elegant, although
simple, statement about a linear transformation being one-to-one.


4.4.1 THEOREM. Let L: X → Y be a linear transformation on two linear spaces
X and Y. Then the transformation L is one-to-one if and only if the null space is
trivial, that is, N(L) = {0}.


Proof: Consider the "only if" part first, that is, assume that L is one-to-one,
and let x ∈ N(L). Since

$$Lx = 0 \quad \text{and} \quad L0 = 0,$$

it follows from the one-to-one assumption that x = 0. Hence N(L) = {0}.
Now consider the "if" part, that is, assume that N(L) = {0}. Let x_1, x_2 be
points in X. Since L is linear, one has

$$Lx_1 = Lx_2 \iff L(x_1 - x_2) = 0 \iff (x_1 - x_2) \in N(L) = \{0\} \iff x_1 = x_2.$$

Hence L is one-to-one. ∎
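
For linear transformations of R^n into R^m given by matrices, Theorem 4.4.1 can be tried out directly. The Python sketch below is an illustration under assumptions of our own: the matrices A and B are arbitrary examples, and the dimension of the null space is computed from the rank, a standard fact about matrices that has not been established at this point in the text.

```python
from fractions import Fraction

def rank(matrix):
    """Rank by Gaussian elimination over the rationals."""
    rows = [[Fraction(v) for v in row] for row in matrix]
    r = 0
    for c in range(len(rows[0])):
        pivot = next((i for i in range(r, len(rows)) if rows[i][c] != 0), None)
        if pivot is None:
            continue
        rows[r], rows[pivot] = rows[pivot], rows[r]
        for i in range(r + 1, len(rows)):
            factor = rows[i][c] / rows[r][c]
            rows[i] = [p - factor * q for p, q in zip(rows[i], rows[r])]
        r += 1
    return r

def nullity(matrix):
    """Dimension of N(L) for L: R^n -> R^m given by the matrix."""
    return len(matrix[0]) - rank(matrix)

def apply(matrix, x):
    return [sum(p * q for p, q in zip(row, x)) for row in matrix]

A = [[1, 2], [3, 4], [5, 6]]   # N(A) = {0}, so A is one-to-one
B = [[1, 2], [2, 4], [3, 6]]   # (2, -1) lies in N(B), so B is not one-to-one

print(nullity(A), nullity(B))
print(apply(B, [1, 1]) == apply(B, [3, 0]))   # distinct points, same image
```

The two points (1,1) and (3,0) differ by (−2,1), which lies in N(B); this is exactly the mechanism used in the "if" part of the proof.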


The next result tells us that the inverse of a linear transformation (when it 
exists) is necessarily linear. 


4.4.2 THEOREM. Let L: X → Y be an invertible linear transformation of X onto
Y, where X and Y are linear spaces. Then the inverse L^{-1}: Y → X is linear.


Proof: Let y_1 = Lx_1 and y_2 = Lx_2, where y_1, y_2 ∈ Y. Using the fact that L
is linear and L^{-1}L = identity on X we have

$$L^{-1}(y_1 + y_2) = L^{-1}(Lx_1 + Lx_2) = L^{-1}L(x_1 + x_2) = x_1 + x_2 = L^{-1}y_1 + L^{-1}y_2.$$

Hence L^{-1} is additive. Similarly one has

$$L^{-1}(\alpha y_1) = L^{-1}(\alpha Lx_1) = L^{-1}L(\alpha x_1) = \alpha x_1 = \alpha L^{-1}y_1.$$

Hence L^{-1} is linear. ∎


EXAMPLE 1. Let us return to the operator L given by

$$Y(i\omega) = H(i\omega) X(i\omega), \tag{4.4.1}$$

which was discussed in Example 2, Section 3. But now assume that H(iω) is bounded
and continuous.
One can show that L is one-to-one if and only if the set

$$N = \{\omega : H(i\omega) = 0\}$$

has measure zero.
It is well-known that operators like L arise in the theory of Fourier transforms.
In particular if one takes the Fourier transform of the convolution equation

$$y(t) = \int_{-\infty}^{\infty} h(t - s) x(s) \, ds,$$

where x, y ∈ L_2(−∞,∞) and h ∈ L_1(−∞,∞), one arrives at (4.4.1), where Y, H,
and X are the Fourier transforms of y, h, and x, respectively. ∎


EXERCISES 


1. Show that the linear transformation y = Lx on L_2(−∞,∞) given by

$$y(t) = \int_{-\infty}^{t} e^{-(t-\tau)} x(\tau) \, d\tau$$

is one-to-one. [Hint: Show that Lx = 0 reduces to ∫_{−∞}^{t} e^{τ} x(τ) dτ = 0. Then
differentiate and use Theorem D.13.3.]


2. Let k(t,s) be continuous for 0 ≤ s ≤ t ≤ T and consider

$$y(t) = x(t) + \int_0^t k(t,s) y(s) \, ds. \tag{4.4.2}$$

The following steps will lead to a proof that the relationship y = Fx implicitly
given by (4.4.2) does define a linear mapping F on C[0,α] provided α is a
sufficiently small positive number.

(a) Define G by x = Gy, where

$$x(t) = y(t) - \int_0^t k(t,s) y(s) \, ds.$$

Show that G is linear.

(b) Assume that for some α > 0, G maps C[0,α] onto itself (cf. Exercise 7,
Section 3.15). Show that if G is one-to-one, then G^{-1} exists and is F, so F
is linear.

(c) Let M satisfy |k(t,s)| ≤ M for 0 ≤ s ≤ t ≤ T, then show that Gy = 0
reduces to

$$|y(t)| \le M \int_0^t |y(s)| \, ds.$$

Now use the Gronwall Inequality, see Cesari [1, p. 35], to show that G is
one-to-one.


3. Show that if L: X → Y is left (right) invertible, then it has a linear left (right)
inverse. Further, show that it is possible for a left (right) inverse of a linear
transformation to be nonlinear.


5. ISOMORPHISMS 


One of the central concepts of mathematics is the concept that two
mathematical structures are equivalent if they can be put into one-to-one correspondence
in a way that preserves structure. We have already used it in several ways. For
example, in Chapter 3 we introduced the concept of isometric metric spaces.
Recall that (X,d_1) and (Y,d_2) are isometric if (i) there exists an invertible mapping
F of (X,d_1) onto (Y,d_2) such that (ii) d_2(F(x_1),F(x_2)) = d_1(x_1,x_2) for all x_1,
x_2 ∈ X. Item (i) is the one-to-one correspondence, and (ii) is the structure
preservation. This means that the only difference between two isometric metric spaces
is in the names given to the elements in their underlying sets.

An analogous situation arises when we deal with linear spaces. Often two 
superficially different linear spaces are only different in the nature of the points 
in their underlying sets. Let us make this concept precise. 


4.5.1 DEFINITION. The linear spaces X and Y over the same scalar field F
are said to be isomorphic if there exists a one-to-one linear mapping T of X onto Y.
The mapping T is then said to be an isomorphism of X onto Y.


Obviously T puts X and Y into a one-to-one correspondence and it preserves
linear space structure. It follows from Theorems 4.4.1 and 4.4.2 that T: X → Y
is an isomorphism if and only if (i) T is one-to-one; (ii) T maps X onto Y; (iii)
T is linear; and (iv) T^{-1} is linear.

Note there is no demand in Definition 4.5.1 that the isomorphism between
two isomorphic linear spaces be unique. As a matter of fact, an infinite number of
isomorphisms usually exist between two isomorphic linear spaces. This is one of
the points illustrated in the following examples.


EXAMPLE 1. Let X = R^2 and let V^2 be the plane discussed in Example 1,
Section 2. The linear spaces R^2 and V^2 are isomorphic. One isomorphism, call it
T_1, mapping R^2 onto V^2 can be defined as follows. Let v_1 and v_2 be any two
noncollinear vectors in V^2 as illustrated in Figure 4.5.1. Then if x = (x_1,x_2) is a point
in R^2, define T_1 by

$$T_1(x) = x_1 v_1 + x_2 v_2,$$

where the operations on the right are the scalar multiplication and addition
operations of V^2. Since the vectors v_1 and v_2 can be chosen almost arbitrarily, it
is also clear that there is an infinite number of isomorphisms between R^2 and V^2.
Another way of saying this is that there is an infinite number of different coordinate
systems possible for V^2. ∎

[Figure 4.5.1. Two noncollinear vectors v_1 and v_2 in V^2.]


EXAMPLE 2. Let X = R^2, and let Z be the linear space made up of all
functions z defined on [0,T] of the form z(t) = a_0 + a_1t, where a_0 and a_1 are arbitrary
real numbers. The linear space structure on Z is, of course, defined in the obvious
way. These two linear spaces are isomorphic. One isomorphism, call it T_2, mapping
R^2 onto Z is defined by

$$T_2(x) = x_1 + x_2 t,$$

where x = (x_1,x_2). We note in passing that the inverse of T_2 can be represented by

$$T_2^{-1}(z) = \Bigl( z(0), \left.\frac{dz}{dt}\right|_{t=0} \Bigr).$$

It can also be shown that Z here and V^2 in Example 1 are isomorphic. (How?) ∎


EXAMPLE 3. Let X denote the collection of all functions φ(t) that satisfy
the differential equation

$$L(x) = a_0 x^{(n)} + a_1 x^{(n-1)} + \cdots + a_n x = 0, \tag{4.5.1}$$

where x^{(j)} = d^j x/dt^j and the coefficients {a_0,...,a_n} are real constants and a_0 ≠ 0.
It is easily seen that X is a linear space. Furthermore, if we let φ_i(t), i = 1, 2, ..., n,
denote the solution of (4.5.1) that satisfies φ_i^{(i−1)}(0) = 1 and

$$\phi_i^{(j)}(0) = 0, \qquad j = 0, 1, \ldots, n-1 \text{ and } j \ne i-1,$$

then any solution φ(t) of (4.5.1) can be written as

$$\phi(t) = c_1 \phi_1(t) + \cdots + c_n \phi_n(t), \tag{4.5.2}$$

where the coefficients {c_1,...,c_n} are real constants. In fact c_i is determined by
c_i = φ^{(i−1)}(0), where φ^{(0)} = φ. We claim that X is isomorphic with R^n. The
isomorphism T: X → R^n is given by

$$T(\phi) = (c_1, \ldots, c_n),$$

where φ satisfies (4.5.2). The details are left as an exercise. ∎
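
A concrete special case may make the mapping T easier to picture. The Python sketch below assumes n = 2 and the equation x'' + x = 0, for which φ_1(t) = cos t and φ_2(t) = sin t satisfy the stated initial conditions. The helper names and the central-difference estimate of φ'(0) are our own choices for illustration.

```python
import math

# Assumed special case of (4.5.1): n = 2 and x'' + x = 0, for which
# phi_1(t) = cos t and phi_2(t) = sin t satisfy the stated initial conditions.
def derivative_at_zero(phi, step=1e-5):
    """Central-difference estimate of phi'(0)."""
    return (phi(step) - phi(-step)) / (2.0 * step)

def T(phi):
    """The mapping T(phi) = (c_1, c_2) = (phi(0), phi'(0))."""
    return (phi(0.0), derivative_at_zero(phi))

def solution(c1, c2):
    """The solution c1*phi_1 + c2*phi_2 of x'' + x = 0."""
    return lambda t: c1 * math.cos(t) + c2 * math.sin(t)

phi, psi = solution(2.0, -1.0), solution(0.5, 4.0)
alpha = 3.0
combo = lambda t: alpha * phi(t) + psi(t)

print(T(phi))     # approximately (2.0, -1.0)
print(T(combo))   # approximately (6.5, 1.0), that is, 3*T(phi) + T(psi)
```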


EXAMPLE 4. Let X = l_2(−∞,∞) and Z be the linear space made up of all
complex-valued functions f(z) defined on the unit circle of the complex plane
such that

$$\frac{1}{2\pi i} \oint |f(z)|^2 \, \frac{dz}{z} < \infty.$$

The usual assumption is made that if f_1(z) and f_2(z) differ only on a set of measure
zero, then f_1 and f_2 are considered to be the same function.
We claim that the following formula

$$x = (\ldots, x_{-1}, x_0, x_1, \ldots) \to f(z),$$

where

$$f(z) = \sum_{n=-\infty}^{\infty} x_n z^{-n}, \tag{4.5.3}$$

defines an isomorphism 𝒵 of X onto Z. The reader probably recognizes that
(4.5.3) is the two-sided z-transform. We remark that the convergence in (4.5.3) is
convergence in the mean, that is,

$$\lim_{N,M \to \infty} \frac{1}{2\pi i} \oint \Bigl| f(z) - \sum_{n=-N}^{M} x_n z^{-n} \Bigr|^2 \frac{dz}{z} = 0.$$

We will delay the proof of the statement that 𝒵 is an isomorphism until Chapter 5.
Here we merely mention that the inverse of 𝒵 can be represented by

$$x_k = \frac{1}{2\pi i} \oint f(z) z^{k} \, \frac{dz}{z}, \qquad k = \ldots, -1, 0, 1, 2, \ldots$$ ∎
2ni Z 


EXERCISES


1. Let 𝒳 be the set of all linear spaces over a scalar field F. Show that the relation
X ~ Y ≡ (X and Y are isomorphic) is an equivalence relation and, therefore,
induces a partition on 𝒳.


2. Referring to Example 2 above, under what conditions on real numbers c_11,
c_12, c_21, c_22 does

$$T(x) = (c_{11} x_1 + c_{12} x_2) + (c_{21} x_1 + c_{22} x_2)\,t,$$

where x = (x_1,x_2), define an isomorphism from R^2 onto Z?




3. Show that the real linear space X made up of all functions of the form x =
a cos(ωt + φ), where ω is fixed, is isomorphic to the complex numbers
considered as a real linear space. (This fact is the cornerstone of the so-called
phasor method of analyzing alternating current electrical networks.)
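
The correspondence behind the phasor method can be tried numerically before it is proved. In the Python sketch below the complex number a·e^{iφ} is matched with the function a cos(ωt + φ); the value of ω, the two sample phasors, and the sample times are arbitrary assumptions made for illustration.

```python
import cmath
import math

omega = 2.0 * math.pi * 50.0   # fixed angular frequency (illustrative value)

def signal(phasor):
    """Map the complex number a*exp(i*phi) to x(t) = a*cos(omega*t + phi)."""
    a, phi = abs(phasor), cmath.phase(phasor)
    return lambda t: a * math.cos(omega * t + phi)

p = cmath.rect(2.0, 0.3)     # corresponds to 2 cos(omega*t + 0.3)
q = cmath.rect(1.5, -1.1)    # corresponds to 1.5 cos(omega*t - 1.1)

# Adding the sinusoids pointwise agrees with adding their phasors.
t_samples = [0.0, 0.001, 0.0042, 0.013]
max_err = max(abs(signal(p)(t) + signal(q)(t) - signal(p + q)(t)) for t in t_samples)
print(max_err)               # rounding-level
```

The check covers additivity only; multiplication by a real scalar, including a negative one (which shifts the phase by π), can be tested in the same way.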


4. Let L_2^σ(−∞,∞) denote the linear space made up of all complex-valued
functions x defined on (−∞,∞) such that

$$\int_{-\infty}^{\infty} |x|^2 e^{-2\sigma t} \, dt < \infty,$$

where σ is a real number. Show that L_2^σ(−∞,∞) is isomorphic to L_2^τ(−∞,∞)
for any σ and τ.


5. Show that L_2(−∞,∞) is isomorphic to L_2[0,∞).
6. Show that l_2(−∞,∞) is isomorphic with l_2(0,∞).


6. LINEAR INDEPENDENCE AND DEPENDENCE 


Linear independence is a property attributable to sets of points in a linear 
space. This section is devoted to presenting a precise formulation of this concept. 
We shall use expressions of the form α_1x_1 + ... + α_nx_n often, so we introduce
the following definition.


4.6.1 DEFINITION. Let A be a set (perhaps infinite) in a linear space X. A
point x ∈ X is said to be a linear combination of points in A if there exists a finite
set of points {x_1,x_2,...,x_n} in A and a finite set of scalars {α_1,α_2,...,α_n} such that

$$x = \alpha_1 x_1 + \alpha_2 x_2 + \cdots + \alpha_n x_n. \tag{4.6.1}$$

If the set A is empty, we agree that the origin 0 is the unique point that is a linear
combination of "points in A."

It should be noted that the expression for x in (4.6.1) may not be uniquely
determined. Furthermore, it is important to note that no matter what the nature
of the linear space X (that is, be it finite or infinite dimensional) or the set A, we
consider finite linear combinations only.


EXAMPLE 1. Let X be the linear space C[0,T], and let A be the infinite set
containing the continuous functions {1, t, t^2, t^3, ...}. The set of all linear
combinations of points in A is the set of all polynomials in t, that is, all functions of the
form

$$x(t) = a_0 + a_1 t + a_2 t^2 + \cdots + a_n t^n, \qquad t \in [0,T],$$

where a_0, a_1, ..., a_n are scalars and n = 0, 1, 2, 3, .... ∎
Let A be a set in a linear space X and let V(A) be the set of all (finite) linear
combinations of points in A. The important fact about the set V(A) is that it is a
linear subspace of X. We refer to it as the linear subspace spanned by A or simply
the span of A. We leave it to the reader to show that V(A) is the "smallest" linear
subspace of X containing A; that is, if M is a linear subspace of X and A ⊂ M,
then V(A) ⊂ M.

Now for the definitions of linear independence and linear dependence. 


4.6.2 DEFINITION. A set A in a linear space X is said to be linearly independent
if for each point x in A, x is not a linear combination of points in the set A − {x},
that is, A with x removed. In other words, x is not in the linear subspace V(A − {x}).
A set A in a linear space X is said to be linearly dependent if it is not linearly
independent, that is, if there exists at least one point x in A such that x is a linear
combination of points in the set A − {x}.


EXAMPLE 2. Let A be the set in a linear space X containing only the origin:
A = {0}. Then the set A − {0} is the empty set. But we have agreed to say that the
origin is a linear combination of points in the empty set, so by Definition 4.6.2,
the set A is linearly dependent. If B is an arbitrary set in X, then {0} ∪ B is linearly
dependent. (Why?) If A is the empty set, then A is linearly independent. (Why?) ∎


EXAMPLE 3. Let A and X be the same as in Example 1. Let x = t^k be an
arbitrary point in A. Is x a linear combination of points in A − {x}? If it is, there
exist integers k_1,...,k_n, not equal to k, and nonzero scalars a_1,...,a_n such that

$$P(t) = t^k - a_1 t^{k_1} - \cdots - a_n t^{k_n} = 0, \qquad t \in [0,T].$$

But this is clearly impossible since this polynomial, which is not identically zero,
can have only a finite number of zeros. Hence the set A is linearly independent. ∎


The following theorem generalizes the method used in Example 3. 


4.6.3 THEOREM. A set A in a linear space X is linearly independent if and only
if for each nonempty finite subset of A, say {x_1,...,x_n}, the only n-tuple of scalars
satisfying the equation

$$a_1 x_1 + a_2 x_2 + \cdots + a_n x_n = 0 \tag{4.6.2}$$

is the trivial solution a_1 = ... = a_n = 0.

Proof: Let us do the "only if" part first, that is, assume that A is linearly
independent. Let {x_1,...,x_n} be a finite collection of distinct points from A.
Assume now that the equation

$$a_1 x_1 + \cdots + a_n x_n = 0 \tag{4.6.3}$$

has a nontrivial solution for the scalars a_1,...,a_n, that is, at least one of the
scalars is nonzero. There is no loss in generality in assuming that a_1 ≠ 0. If we let
b_i = −a_i/a_1 for i = 2,...,n, then one has

$$x_1 = b_2 x_2 + \cdots + b_n x_n, \tag{4.6.4}$$

that is, x_1 is a linear combination of {x_2,...,x_n}. This contradicts the fact that
A is linearly independent. Hence (4.6.2) has the trivial solution a_1 = ... = a_n = 0
as its only solution.

Now for the "if" part of the theorem. Assume that the only solution of (4.6.2)
is a_1 = ... = a_n = 0. If a point x_1 in A can be written as a linear combination of
points in A − {x_1}, then (4.6.4) holds for some points {x_2,...,x_n} in A where
x_1 ≠ x_i, 2 ≤ i ≤ n. But this implies that (4.6.2) has the nontrivial solution a_1 = 1,
a_2 = −b_2, ..., a_n = −b_n, which is a contradiction. Hence A is linearly
independent. ∎


The following is an obvious corollary to Theorem 4.6.3. 


4.6.4 COROLLARY. A nonempty set A in a linear space X is linearly dependent
if and only if there is at least one nonempty finite subset of A, say {x_1,...,x_n}, and
scalars a_1,...,a_n, where not all a_i are zero, such that

$$a_1 x_1 + \cdots + a_n x_n = 0.$$


Let us now turn to another way of characterizing linear independence. We 
motivate the next theorem with the following example. 


EXAMPLE 4. Let X be the linear space R^3, and let A be the set {x_1,x_2,x_3},
where x_1 = (1,0,0), x_2 = (0,1,0), x_3 = (0,0,1). The set A is linearly independent,
and V(A), the span of A, is R^3 itself. If x = (a_1,a_2,a_3) is an arbitrary point in
R^3, there is one and only one way to express x in the form

$$x = \alpha_1 x_1 + \alpha_2 x_2 + \alpha_3 x_3,$$

namely, α_1 = a_1, α_2 = a_2, α_3 = a_3.

Continuing with the example, let C be the set {y_1,y_2,y_3}, where y_1 = (1,0,0),
y_2 = (0,1,0), y_3 = (1,1,0). The set C is linearly dependent, and V(C) is the set of all
points of the form (a_1,a_2,0). Now if y is a point in V(C), there is more than one
way to express it in the form

$$y = \beta_1 y_1 + \beta_2 y_2 + \beta_3 y_3.$$

For example, if y = (2,1,0), then one has

$$y = 2y_1 + y_2 = y_1 + y_3 = -y_2 + 2y_3.$$ ∎
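
The arithmetic in Example 4 is easy to confirm by machine. The short Python sketch below (the helper name combine is our own) evaluates the three coefficient triples listed for the dependent set C, each of which produces the point (2,1,0), and exhibits one nontrivial combination of y_1, y_2, y_3 that gives the origin.

```python
def combine(coeffs, vectors):
    """Return sum_i coeffs[i] * vectors[i] for vectors in R^3."""
    return tuple(sum(c * v[j] for c, v in zip(coeffs, vectors)) for j in range(3))

A = [(1, 0, 0), (0, 1, 0), (0, 0, 1)]   # the independent set of Example 4
C = [(1, 0, 0), (0, 1, 0), (1, 1, 0)]   # the dependent set of Example 4

y = (2, 1, 0)
representations = [(2, 1, 0), (1, 0, 1), (0, -1, 2)]
print([combine(r, C) == y for r in representations])   # every entry is True

# A nontrivial solution of b1*y1 + b2*y2 + b3*y3 = 0 exhibits the dependence of C.
print(combine((1, 1, -1), C))

# For A, the combination a1*x1 + a2*x2 + a3*x3 simply returns (a1, a2, a3),
# so only the trivial triple gives the origin.
print(combine((5, -7, 2), A))
```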


The next theorem shows that the uniqueness of representation (or expression) 
concept illustrated by the preceding example characterizes linear independence. 


4.6.5 THEOREM. Let A be a nonempty set in a linear space X. The set A is 
linearly independent if and only if for each x #0 in V(A), there is one and only one 
finite subset of A, say {X1,X2,...,X,}, and a unique n-tuple of nonzero scalars, say 
{@,, a2, ..., a,}, such that 


X=4,xX,; + nie + a,X,- 


4.6. LINEAR INDEPENDENCE AND DEPENDENCE 179 
Proof: Let us consider the "only if" part first, that is, assume that $A$ is
linearly independent. Let $x \in V(A)$ where $x \neq 0$, and assume that

$$x = a_1x_1 + \cdots + a_Nx_N = b_1y_1 + \cdots + b_My_M,$$

where $\{x_1,\ldots,x_N\}$ and $\{y_1,\ldots,y_M\}$ are two sets in $A$ and the coefficients $a_1,\ldots,a_N$,
$b_1,\ldots,b_M$ are all nonzero. We then want to show two things: First that the sets
$\{x_1,\ldots,x_N\}$ and $\{y_1,\ldots,y_M\}$ are the same. (From this fact we can, and do, assume
that $x_1 = y_1,\ldots,x_N = y_N$ and $N = M$.) Secondly, we want to show that
$a_1 = b_1,\ldots,a_N = b_N$.

First we note that

$$a_1x_1 + \cdots + a_Nx_N - b_1y_1 - \cdots - b_My_M = x - x = 0,$$

which is a special form of Equation (4.6.2). Since $a_1 \neq 0$ and since $A$ is linearly
independent, Theorem 4.6.3 assures us that $x_1$ must be included in the set $\{y_1,\ldots,y_M\}$,
say that $x_1 = y_1$, and that $a_1 = b_1$. Similarly, since $a_2 \neq 0$, we see that $x_2$ lies
in $\{y_1,\ldots,y_M\}$, say that $x_2 = y_2$, and that $a_2 = b_2$. By continuing in this fashion
we see that the representation for $x$ is unique.

Now consider the "if" part of the theorem. Assume that for each $x$ in $V(A)$
the sets $\{x_1,\ldots,x_n\}$ and $\{a_1,\ldots,a_n\}$ are unique. We must show that $A$ is linearly
independent. Let $x_0$ be any point in $A$. Trivially $x_0 = x_0$, and by our assumption
this is the only way to express $x_0$ as a linear combination of points in $A$. But then
$x_0$ is not a linear combination of points in $A - \{x_0\}$. Hence, $A$ is linearly
independent. ∎


Yet another characterization of linear independence is given by the following 
theorem. 


4.6.6 THEOREM. Let $A$ be a set in a linear space $X$. The set $A$ is linearly
independent if and only if there is no proper subset $A_0$ of $A$ such that $V(A_0) = V(A)$.


We leave the proof of this theorem to the reader. 

This is probably not the first time that the reader has confronted the concept 
of linear independence. Perhaps the only new aspects are the extension of this 
concept to linear spaces made up of arbitrary objects (for example, functions, 
sequences, random variables) and to infinite-dimensional linear spaces. Moreover, it 
should now be clear that linear independence is an algebraic concept and does not 
involve topological structure. 

The following example illustrates the concept of a linear subspace being 
spanned by a set. Moreover, it is the basis for some later examples in this book. 


EXAMPLE 5. Let $X = l_2(-\infty,\infty)$. Let $S_r\colon X \to X$ be the right shift, that is,
if $x = \{\ldots,x_{-1},x_0,x_1,\ldots\}$ and $y = S_rx = \{\ldots,y_{-1},y_0,y_1,\ldots\}$, then

$$y_n = x_{n-1} \qquad (n = \ldots,-1,0,1,2,3,\ldots).$$

In other words, $S_r$ "shifts" the sequence $x$ to the right by one position. As usual
we denote the composition of $S_r$ with itself $n$ times by $S_r^n$. $S_r$ is clearly invertible,
so $S_r^n$ with a negative $n$ is meaningful. Moreover, $S_r^0 = I$.

Let $x$ be any nonzero point in $X$, and consider the set $A_x$ of all $y \in X$ that can
be expressed in the form

$$y = \sum_{n=N}^{M} \alpha_n S_r^n x,$$

where $N, M = \ldots,-2,-1,0,1,2,3,\ldots$ and the $\alpha_n$'s are scalars. The set $A_x$ is
obviously a linear subspace of $X$. Indeed, it is the subspace spanned by the set

$$\{\ldots, S_r^{-2}x,\ S_r^{-1}x,\ x,\ S_rx,\ S_r^2x, \ldots\}.$$


We can use the two-sided z-transform (Example 4, Section 5) to give a simple
characterization of the subspace $A_x$. Let $Z$ denote the linear space of functions
used in Example 4, Section 5, and let $\mathcal{F}$ denote the two-sided z-transform, that is,

$$\mathcal{F}(\ldots,x_{-1},x_0,x_1,\ldots) = \sum_{k=-\infty}^{\infty} x_k z^{-k},$$

where the convergence is convergence in the mean.

We remarked in Example 4, Section 5 that $\mathcal{F}$ is an isomorphism of $X$ onto $Z$.
Hence, if we characterize the linear subspace $\mathcal{F}(A_x)$ in $Z$, that will be equivalent to
characterizing $A_x$ in $X$. But

$$\mathcal{F}(S_rx) = z^{-1}\mathcal{F}(x) \quad\text{and}\quad \mathcal{F}(S_r^nx) = z^{-n}\mathcal{F}(x),$$

so

$$\mathcal{F}\Big[\sum_{n=N}^{M} \alpha_n S_r^n x\Big] = \Big[\sum_{n=N}^{M} \alpha_n z^{-n}\Big]\mathcal{F}(x).$$

In other words, each point in the linear subspace $\mathcal{F}(A_x)$ of $Z$ is of the form

$$p(z)\,\mathcal{F}(x) \qquad (4.6.5)$$

where $p$ is a polynomial in $z$ and $1/z$ given by

$$p(z) = \sum_{n=N}^{M} \alpha_n z^{-n}. \qquad (4.6.6)$$

Or, a point in $X$ is in the subspace $A_x$ if and only if its z-transform is of the form
(4.6.5).

We remark in passing that it should be obvious that $A_x \neq X$. In fact, if $x$ and
$y$ are points in $X$ such that the ratio $\mathcal{F}(x)/\mathcal{F}(y)$ is not of the form (4.6.6), then
the subspaces generated by $x$ and $y$ are disjoint, that is, $A_x \cap A_y = \{0\}$. ∎
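
The shift identities used in Example 5 can be illustrated numerically. The sketch below is not from the original text and rests on a simplifying assumption: it works only with finitely supported sequences (stored as dictionaries from index to value), so the z-transform is a finite sum and no convergence question arises. The function names `shift` and `ztransform` and the sample data are hypothetical.

```python
import cmath

def shift(x, n=1):
    """Right shift by n positions: (S_r^n x)_k = x_{k-n}; x maps index -> value."""
    return {k + n: v for k, v in x.items()}

def ztransform(x, z):
    """Two-sided z-transform sum_k x_k z^(-k) of a finitely supported sequence."""
    return sum(v * z ** (-k) for k, v in x.items())

x = {-1: 2.0, 0: 1.0, 3: -0.5}      # sample sequence, zero elsewhere
z = cmath.exp(0.7j)                 # a point on the unit circle |z| = 1

# Identity F(S_r x) = z^(-1) F(x).
lhs = ztransform(shift(x, 1), z)
rhs = z ** (-1) * ztransform(x, z)

# A point y of A_x: y = sum_n alpha_n S_r^n x, with negative powers allowed.
alphas = {-2: 1.5, 0: -1.0, 1: 0.25}
y = {}
for n, a in alphas.items():
    for k, v in shift(x, n).items():
        y[k] = y.get(k, 0.0) + a * v

# Its transform should equal p(z) F(x) with p(z) = sum_n alpha_n z^(-n).
p = sum(a * z ** (-n) for n, a in alphas.items())
err_shift = abs(lhs - rhs)
err_span = abs(ztransform(y, z) - p * ztransform(x, z))
print(err_shift, err_span)
```

Both errors are at the level of floating-point rounding, which is what the forms (4.6.5) and (4.6.6) predict for this restricted class of sequences.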




EXERCISES 


1. Let $T$ be an isomorphic mapping of a linear space $X$ onto a linear space $Y$.
Show that $A \subset X$ is linearly independent if and only if its image $T(A) \subset Y$
is linearly independent.


2. Let $X = L_2(\Omega, \mathcal{F}, P)$, the linear space made up of all complex-valued random
variables $x$ defined on a probability space $(\Omega, \mathcal{F}, P)$ such that

$$E\{x\bar{x}\} = \int_\Omega x\bar{x}\,dP = \int_\Omega |x|^2\,dP < \infty,$$

where $\bar{x}$ denotes the complex conjugate of $x$. Let $A \subset X$ be the set containing
two random variables $x_1$ and $x_2$ such that

$$E\{x_1\bar{x}_1\} = 1, \quad E\{x_2\bar{x}_2\} = 1, \quad E\{x_1\bar{x}_2\} = 0.$$

Show that $y \in X$ is a linear combination of points in $A$ if and only if

$$E\{y\bar{y}\} = |E\{y\bar{x}_1\}|^2 + |E\{y\bar{x}_2\}|^2.$$

Moreover, show that $A$ is linearly independent.


3. Let $X$ be the linear space $L_2[0,2\pi]$ and $A$ the set of all functions $x_n(t) = e^{int}$,
$n = 0,1,2,\ldots$. Show that $A$ is linearly independent. [Hint: Assume that
$a_1e^{in_1t} + \cdots + a_me^{in_mt} = 0$. Differentiate $(m-1)$ times.]


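4. Show that a finite set $A = \{x_1,\ldots,x_n\}$ in a linear space $X$ is linearly
independent if and only if the only $n$-tuple of scalars satisfying the equation

$$a_1x_1 + \cdots + a_nx_n = 0$$

is $a_1 = \cdots = a_n = 0$.

5. Let $X$ be the linear space $R^n$, and let $A$ be the set containing the $n$ vectors

$$x_1 = (x_{11},x_{21},\ldots,x_{n1}),$$
$$x_2 = (x_{12},x_{22},\ldots,x_{n2}),$$
$$\vdots$$
$$x_n = (x_{1n},x_{2n},\ldots,x_{nn}).$$

Show that this set is linearly independent if and only if

$$\det\begin{bmatrix} x_{11} & x_{12} & \cdots & x_{1n} \\ x_{21} & x_{22} & \cdots & x_{2n} \\ \vdots & \vdots & & \vdots \\ x_{n1} & x_{n2} & \cdots & x_{nn} \end{bmatrix} \neq 0.$$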


6. Show that a set A in a linear space X is linearly independent if and only if 
every finite subset of A is linearly independent. 


7. Prove Theorem 4.6.6. 




8. Let the state of a dynamic system be a point in the linear space $X = R^n$. Let
the state at time $k = 0, 1, 2, \ldots$ be denoted by $x_k$. Further suppose that the
evolution of the system is characterized by a linear transformation $T\colon X \to X$;
in particular, $x_k = Tx_{k-1}$. Let $x_0$ be an arbitrary initial state, and consider the
set

$$A = \{x_0, Tx_0, T^2x_0, \ldots\}.$$

Show that there exists an integer $p$ such that for

$$A_p = \{x_0, Tx_0, \ldots, T^px_0\}$$

one has $V(A) = V(A_p)$, that is, $A$ and $A_p$ span exactly the same linear subspace.


9. Let $X$ be the linear space made up of all real-valued random variables defined
on some probability space. Let $A$ be a set of random variables in $X$. Show that
if a random variable $z \in X$ is stochastically independent of each random variable
in $A$, then $z$ is not in $V(A)$.


10. Show that Theorem 4.6.5 remains true even when the set $A$ is empty or when
we consider the point $x = 0$ in $V(A)$.


11. Let $X$ be a linear space. A set $K \subset X$ is said to be convex if

$$\lambda x + (1-\lambda)y \in K \qquad (0 \le \lambda \le 1),$$

whenever $x, y \in K$. Let $K_1$ and $K_2$ be two convex sets in $X$.
(a) Show that $K_1 \cap K_2$ is convex.
(b) Is $K_1 \cup K_2$ convex?


12. Let $y = f(x)$ be a $C^2$ function defined for $-\infty < x < \infty$. Find a condition
on $d^2f/dx^2$ in order that

$$K = \{(x,y) \in R^2 : y \ge f(x)\}$$

be convex.


13. Show that the polyhedron

$$P_n = \Big\{x = (x_1,\ldots,x_n) : x_i \ge 0 \text{ and } \sum_{i=1}^{n} x_i = 1\Big\}$$

is convex in $R^n$.


14. (Continuation of Exercise 13.) Let $(\lambda_1,\ldots,\lambda_n) \in P_n$.
(a) Show that


and show that equality holds if and only if $\lambda_i = 1/n$, $1 \le i \le n$. [Hint: Use
mathematical induction.]

(b) Some stockbrokers recommend the method of "dollar cost averaging"
for periodic investments, Engel [1; pp. 181 ff]. Give a mathematical
description of this method and use the inequality above to show that this
method results in a lower cost per share than the method of buying an
equal number of shares at each investment period.


15. Let $A \subset X$ where $X$ is a linear space.
(a) Show that $V(A)$ is the smallest linear subspace of $X$ that contains $A$. That
is, show that if $M$ is a linear subspace of $X$ with $A \subset M$, then $V(A) \subset M$.
(b) Show that $A$ is a linear subspace of $X$ if and only if $A = V(A)$.


7. HAMEL BASES AND DIMENSION 


We have used the terms “finite dimensional” and “infinite dimensional” 
somewhat casually up to this point. One thing we do in this section is give precise 
meanings to these terms. The other thing we do is introduce the concept of a 
Hamel basis for a linear space. A Hamel basis is important, as we shall see, for
several reasons. First it is the natural concept of basis for spaces that have linear 
structure only. Secondly, it allows one to distinguish between finite- and infinite- 
dimensional linear spaces. Indeed, we shall use exactly the same distinction in 
normed linear spaces. 

On the other hand, Hamel basis is not the only concept of basis that arises in 
analysis. There are concepts of basis that involve topological as well as linear 
structure. Although these other bases reduce to Hamel bases on finite-dimensional 
linear spaces, they are usually quite different from Hamel bases on infinite- 
dimensional spaces. In fact, in applications involving infinite-dimensional spaces 
a useful basis, if one even exists, is usually something other than a Hamel basis. 
For example, a complete orthonormal set is far more useful in an infinite-dimen- 
sional Hilbert space than a Hamel basis. 

A Hamel basis, then, is a purely algebraic concept that serves many important 
purposes, but it is not the last word on bases. Having said this, let us see what one is.


4.7.1 DEFINITION. A set B in a linear space X is said to be a Hamel basis 
for X if (i) B is linearly independent and (ii) V(B) = X, that is, the span of B 
is X itself. 


Of course, this definition is simply a generalization of the familiar concept of 
coordinate system. 


EXAMPLE 1. In the plane a set $B$ containing any two noncollinear vectors is
a Hamel basis or coordinate system for the plane. ∎


EXAMPLE 2. Let $X$ be the real linear space made up of sequences $x =
(x_1,x_2,\ldots)$ such that $\sum_{i=1}^{\infty}|x_i|^2 < \infty$. Let $A = \{e_1,e_2,\ldots\}$ where $e_i$ is the sequence
$e_i = (\delta_{i1},\delta_{i2},\ldots)$ and $\delta_{ij}$ is the Kronecker function. It is easily shown that $A$ is
linearly independent, so one suspects that $A$ is a basis in the sense of Definition
4.7.1. It is not! Since we allow ourselves only finite sums, $a_1x_1 + \cdots + a_nx_n$, we
see that $V(A) \neq X$. In fact, $V(A)$ is the linear subspace of $X$ made up of all sequences
that are nonzero on only a finite number of entries. The reader may complain,
saying he knows how to form infinite linear combinations of points in $A$ to obtain
arbitrary points in $X$, that is,

$$x = \sum_{i=1}^{\infty} x_ie_i.$$

While this is true, it is important to realize that the theory of infinite series requires
more than just the linear space structure. We shall return to this point in Sections
5.17 and 5.18. ∎
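
The point of Example 2, that finite sums of the $e_i$ reach only finitely supported sequences, can be seen in a small computation. This is an illustrative sketch only: the sparse dictionary representation, the function names, and the choice of the sequence $x_k = 1/k$ are assumptions introduced here, not part of the original text.

```python
from fractions import Fraction

def e(i):
    """The sequence e_i, stored sparsely as {index: value}."""
    return {i: Fraction(1)}

def linear_combination(terms):
    """Finite sum of a * e_i over (a, i) in terms; zero entries are dropped."""
    out = {}
    for a, i in terms:
        for k, v in e(i).items():
            out[k] = out.get(k, Fraction(0)) + a * v
    return {k: v for k, v in out.items() if v != 0}

def x(k):
    """A square-summable sequence, x_k = 1/k, nonzero at every index k >= 1."""
    return Fraction(1, k)

n = 50
partial = linear_combination([(x(k), k) for k in range(1, n + 1)])

support_size = len(partial)      # a finite combination has finite support
first_miss = max(partial) + 1    # an index where the combination is 0 but x is not
print(support_size, partial.get(first_miss, 0), x(first_miss))
```

However large `n` is taken, the combination vanishes beyond index `n` while `x` does not, so `x` is not in $V(A)$.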


So far we know only that some linear spaces have Hamel bases. The next 
theorem implies that every linear space has a Hamel basis. 


4.7.2 THEOREM. If $A$ is a linearly independent set in a linear space $X$, then
there exists a Hamel basis $B$ for $X$ such that $A \subset B$.


Since every linear space contains the empty set and the empty set is linearly 
independent, it follows from this theorem that every linear space has a Hamel 
basis. Of course, this theorem just says that a basis exists. It does not tell us how 
to find a basis. Since it is not crucial that the reader know the proof of this theorem, 
the proof is omitted here and outlined in the exercises. (Also see Appendix C.) 

At this point a rather disconcerting thought should occur to the reader. If a 
linear space X has several different Hamel bases, do some Hamel bases have 
“fewer” points than others? Happily the answer to this question is no. In a 
meaningful sense all Hamel bases of a linear space $X$ contain the same number of
points. 


4.7.3 THEOREM. If $B_1$ and $B_2$ are Hamel bases for a linear space $X$, then
$B_1$ and $B_2$ have the same cardinal number.


Recall that two sets have the same cardinal number if they can be put into a 
one-to-one correspondence with one another. The reader unfamiliar with the 
concept of cardinal numbers need merely note that intuitively this is a perfectly 
reasonable way to characterize the fact that two sets contain the same number of 
points. (See Appendix B.) 

We shall not prove Theorem 4.7.3 here. One of the exercises at the end of this 
section sketches the proof for the finite-dimensional case and another exercise
sketches it for the infinite-dimensional case. 

Theorems 4.7.2 and 4.7.3 furnish the foundation for a meaningful concept of 
dimension. 


4.7.4 DEFINITION. The cardinal number of any Hamel basis of a linear space 
X is said to be the dimension of X. We denote the dimension of X by dim(X). 




Again Theorem 4.7.2 shows that every linear space has a Hamel basis, and 
Theorem 4.7.3 shows that dimension is a property of the linear space in question 
and not dependent on the particular Hamel basis considered. If the dimension of 
X is finite, we say that X is a finite-dimensional linear space. Otherwise, we say that 
X is an infinite-dimensional linear space. This then is the distinction between finite- 
and infinite-dimensional linear spaces no matter what additional structure (for 
example, a norm) may be present. 

The following theorem shows that for a given dimension there is essentially 
only one kind of linear space over a given scalar field. 


4.7.5 THEOREM. If $X_1$ and $X_2$ are linear spaces over the same scalar field,
then $X_1$ and $X_2$ are isomorphic if and only if $\dim X_1 = \dim X_2$.


Proof: Suppose $X_1$ and $X_2$ are isomorphic, and let $B_1$ be a Hamel basis for
$X_1$. Let $f\colon X_1 \to X_2$ be an isomorphism of $X_1$ onto $X_2$. We leave it to the reader
to show that $B_2 = f(B_1)$ is a Hamel basis for $X_2$. Then since $f$ is one-to-one,
$\mathrm{card}(B_1) = \mathrm{card}(B_2)$; so $\dim(X_1) = \dim(X_2)$.

Now assume that $\dim(X_1) = \dim(X_2)$. Let $B_1$ and $B_2$ be Hamel bases for
$X_1$ and $X_2$, respectively. Since $\dim(X_1) = \dim(X_2)$, there is a one-to-one mapping
$f$ of $B_1$ onto $B_2$. (That is what having the same cardinal number means.) Using
this correspondence $f$, we now define a linear mapping $f\colon X_1 \to X_2$: Let $x$ be any
point in $X_1$, $x \neq 0$. By Theorem 4.6.5, $x$ can be expressed uniquely in the form

$$x = a_1x_1 + \cdots + a_nx_n,$$

where $x_i \in B_1$ and $a_i \neq 0$, $i = 1,\ldots,n$. We let

$$f(x) = a_1f(x_1) + \cdots + a_nf(x_n)$$

and

$$f(0) = 0.$$

It follows immediately from the fact that $B_2$ is a basis that $f$ is one-to-one and
maps $X_1$ onto $X_2$. Moreover, $f$ is clearly linear. Hence, $f$ is an isomorphism and
$X_1$ and $X_2$ are isomorphic. ∎


4.7.6 COROLLARY. If $X$ is a finite-dimensional linear space over a scalar
field $F$, where $\dim(X) = n$, then $X$ is isomorphic to $F^n$, the linear space made up of
ordered $n$-tuples of scalars.


In other words, all $n$-dimensional real linear spaces are isomorphic to $R^n$, and
all complex ones to $C^n$.

Both the foregoing results deserve serious consideration. They raise the 
following question. Since all linear spaces of a given dimension over a given scalar 
field are essentially the same linear space, why bother to study and discuss linear 
spaces at an abstract level? The alternative would be to study a typical linear space 
for each dimension. For instance, with finite-dimensional spaces we could limit 
our study to $F^n$. There are two important reasons for not doing so.




First, each time we proved something about, say, $F^n$, we would have to show
that it held for all linear spaces isomorphic to $F^n$. Otherwise, we would not be
making a statement about n-dimensional spaces in general. Needless to say, this 
would not be a saving of effort. 

Secondly, in many applications, in fact in most, there is other structure (for 
example, topological) present, and this other structure is usually germane to the 
application. Attempting consistently to recast such problems in terms of "preferred
linear space formulations" would be an extremely awkward practice.

The following theorem is an important statement about the dimensions of the 
range and null space of a linear transformation. 


4.7.7 THEOREM. Let $L\colon X \to Y$ be a linear transformation where $X$ and $Y$
are linear spaces. Then the dimensions of $\mathcal{N}(L)$, $\mathcal{R}(L)$, and $X$ are related by the formula

$$\dim\{\mathcal{N}(L)\} + \dim\{\mathcal{R}(L)\} = \dim\{X\}.$$


This result is particularly useful when X is finite dimensional. 


Proof: Let $B_N$ be a Hamel basis for $\mathcal{N}(L)$. Then there exists (Theorem
4.7.2) a Hamel basis $B_X$ for $X$ such that $B_N \subset B_X$. Let $B = \{y \in B_X : y \notin B_N\}$.
One clearly has $V(B) \cap \mathcal{N}(L) = \{0\}$. Furthermore, the set $L(B)$ in $Y$ spans the
range of $L$. (Why?) We assert that $L(B)$ is a linearly independent set in $Y$. To see
this we consider the equation

$$\beta_1L(x_1) + \cdots + \beta_nL(x_n) = 0, \qquad (4.7.1)$$

where $\{x_1,\ldots,x_n\}$ is a finite set of distinct elements in $B$. Since $L$ is linear, (4.7.1)
implies that

$$L(\beta_1x_1 + \cdots + \beta_nx_n) = 0$$

or $(\beta_1x_1 + \cdots + \beta_nx_n) \in \mathcal{N}(L)$. Since the point $(\beta_1x_1 + \cdots + \beta_nx_n)$ is also in
$V(B)$ one then has

$$\beta_1x_1 + \cdots + \beta_nx_n = 0. \qquad (4.7.2)$$

Since $B$ is linearly independent, it follows from Theorem 4.6.3 that $\beta_1 = \cdots =
\beta_n = 0$. By applying Theorem 4.6.3 again to Equation (4.7.1) we see that $L(B)$ is
linearly independent in $Y$.

Since $L(B)$ is linearly independent and spans $\mathcal{R}(L)$ it follows that $L(B)$ is a
Hamel basis for $\mathcal{R}(L)$. The conclusion now follows from standard cardinal
arithmetic (see the exercises in Appendix B). ∎
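
For a transformation given by a matrix, the formula of Theorem 4.7.7 can be verified directly. The following sketch is an illustration under stated assumptions, not the method of the proof above: it takes $X = Q^4$ and $Y = Q^3$ over the rationals, uses Gauss-Jordan elimination (a standard technique swapped in here for convenience), and the function names `rref` and `null_space_basis` and the sample matrix are hypothetical.

```python
from fractions import Fraction

def rref(rows):
    """Reduced row echelon form over the rationals; returns (matrix, pivot columns)."""
    m = [[Fraction(v) for v in row] for row in rows]
    pivots, r = [], 0
    for c in range(len(m[0])):
        pivot = next((i for i in range(r, len(m)) if m[i][c] != 0), None)
        if pivot is None:
            continue                      # no pivot in this column: a free variable
        m[r], m[pivot] = m[pivot], m[r]
        pv = m[r][c]
        m[r] = [v / pv for v in m[r]]     # scale the pivot row so the pivot is 1
        for i in range(len(m)):
            if i != r and m[i][c] != 0:
                f = m[i][c]
                m[i] = [a - f * b for a, b in zip(m[i], m[r])]
        pivots.append(c)
        r += 1
        if r == len(m):
            break
    return m, pivots

def null_space_basis(rows):
    """One basis vector of the null space for each free (non-pivot) column."""
    m, pivots = rref(rows)
    ncols = len(rows[0])
    basis = []
    for f in (c for c in range(ncols) if c not in pivots):
        v = [Fraction(0)] * ncols
        v[f] = Fraction(1)
        for r, p in enumerate(pivots):
            v[p] = -m[r][f]
        basis.append(v)
    return basis

# Sample L: Q^4 -> Q^3; the third row is the sum of the first two.
L = [[1, 2, 0, 1],
     [0, 1, 1, 0],
     [1, 3, 1, 1]]

rank = len(rref(L)[1])                    # dimension of the range
kernel = null_space_basis(L)
nullity = len(kernel)                     # dimension of the null space
print(rank, nullity, rank + nullity == len(L[0]))
```

Here the number of pivot columns plays the role of $\dim\{\mathcal{R}(L)\}$ and the number of free columns that of $\dim\{\mathcal{N}(L)\}$.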


EXERCISES 


1. Let $X$ be a finite-dimensional space with $\dim(X) = n$. Show that every set
containing $n + 1$ points is linearly dependent.

2. Let $A$ be a linear subspace of a linear space $X$. Show that $\dim(A) \le \dim(X)$.
Moreover, if $X$ is finite dimensional and $A$ is a proper linear subspace of $X$,
show that $\dim(A) < \dim(X)$.


3. Consider the following differential equation defined on $C^2[0,\infty)$

$$\frac{d^2x}{dt^2} + b\frac{dx}{dt} + cx = 0. \qquad (4.7.3)$$

If $X$ denotes the set of all solutions of (4.7.3) show that $X$ is a linear subspace
of $C^2[0,\infty)$ and that $\dim(X) = 2$.


4. Show that if $A$ is a set in a linear space $X$ with $V(A) = X$, then $A$ contains a
Hamel basis of $X$.


5. Let $X$ be the real linear space made up of all functions of the form $x(t) =
a\cos(\omega t + \phi)$, where $\omega$ is fixed. Show that $B = \{\cos\omega t, \sin\omega t\}$ is a basis for $X$.


6. Prove Theorem 4.7.2. [Hint: Let $P$ be the class made up of all linearly independent
sets in the linear space $X$, that is, an element of $P$ is a linearly independent
set in $X$. Let $P$ be partially ordered by inclusion. Then use Zorn's lemma.
Compare with Exercise 5, Section 3.]


7. Prove Theorem 4.7.3 for the finite-dimensional case. [Hint: Assume that
$B_1 = \{x_1,\ldots,x_n\}$ is a finite Hamel basis for $X$, and let $B_2$ be any other Hamel
basis of $X$. Let $x_i$ be a point in $B_1$, and let $B_2(x_i)$ be the unique finite set (Theorem
4.6.5) of points in $B_2$ needed to express $x_i$. Show that

$$B_2 = B_2(x_1) \cup B_2(x_2) \cup \cdots \cup B_2(x_n)$$

and that $B_2$ is finite. Let $B_2 = \{y_1,\ldots,y_m\}$. The point $y_1$ is a linear combination
of points in $B_1$, that is, $y_1 = a_1x_1 + \cdots + a_nx_n$. Argue that at least
one of the coefficients $a_1,\ldots,a_n$ is nonzero; for example, $a_1 \neq 0$. Then
$x_1 = (1/a_1)y_1 - (a_2/a_1)x_2 - \cdots - (a_n/a_1)x_n$. Deleting $x_1$, the set $y_1, x_2, \ldots, x_n$
spans $X$. Continue along this line of argument to show that $n \ge m$. Then
reverse the argument and show that $m \ge n$.]


8. Prove Theorem 4.7.3 for the infinite-dimensional case. [Hint: For each $x \in B_1$,
let $B_2(x)$ be the unique finite set of points in $B_2$ needed to express $x$. Show that
$y \in B_2$ implies that $y \in B_2(x)$ for some $x$. Then show that

$$B_2 = \bigcup_{x \in B_1} B_2(x).$$

Then show that $\mathrm{card}(B_2) \le \mathrm{card}(B_1)$. Now reverse the roles of $B_1$ and $B_2$
to get $\mathrm{card}(B_1) \le \mathrm{card}(B_2)$.]


9. Show that $C[0,T]$ is infinite dimensional. [Hint: Construct a linearly independent
set containing $n$ points, where $n$ is an arbitrary integer.]


10. Let $L$ be a linear transformation of $X$ into $Y$ where $X$ and $Y$ are both finite
dimensional.

(a) Show that $L$ maps $X$ onto $Y$ if and only if $\dim\mathcal{R}(L) = \dim Y$.

(b) Show that $L$ is one-to-one if and only if $\dim\mathcal{R}(L) = \dim X$.

(c) Show that $L$ is invertible if and only if $\dim X = \dim Y = \dim\mathcal{R}(L)$.
(d) What can one say about infinite-dimensional spaces?




8. THE USE OF MATRICES TO REPRESENT LINEAR 
TRANSFORMATIONS 


Let us now turn to the important topic of the representation of linear
transformations by matrices. Let $X$ and $Y$ be finite dimensional, and let $T\colon X \to Y$ be
linear. Roughly speaking, if $Tx = y$, where $x$ is expanded in terms of a basis of $X$ and $y$ is
expanded in terms of a basis of $Y$, then a matrix is used to relate the coefficients in
these two expansions.


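Let $B_1 = \{x_1,x_2,\ldots,x_n\}$ and $B_2 = \{y_1,y_2,\ldots,y_m\}$ be bases for $X$ and $Y$,
respectively. We know from Section 7 that any $x \in X$ and any $y \in Y$ can be expressed
uniquely in the form

$$x = \alpha_1x_1 + \cdots + \alpha_nx_n$$

and

$$y = \beta_1y_1 + \cdots + \beta_my_m,$$

respectively. Thus for any $x \in X$, $T(x)$ can uniquely be expressed

$$T(x) = \beta_1y_1 + \cdots + \beta_my_m$$

and from the linearity of $T$

$$T(x) = \alpha_1T(x_1) + \cdots + \alpha_nT(x_n).$$

But $T(x_1),\ldots,T(x_n)$ are points in $Y$, so they can be uniquely expressed

$$T(x_1) = t_{11}y_1 + \cdots + t_{m1}y_m,$$
$$T(x_2) = t_{12}y_1 + \cdots + t_{m2}y_m,$$
$$\vdots$$
$$T(x_n) = t_{1n}y_1 + \cdots + t_{mn}y_m,$$

where the $t_{ij}$'s are scalars. Therefore, we have

$$T(x) = \beta_1y_1 + \cdots + \beta_my_m$$

and

$$T(x) = \alpha_1(t_{11}y_1 + \cdots + t_{m1}y_m) + \cdots + \alpha_n(t_{1n}y_1 + \cdots + t_{mn}y_m)$$
$$= (t_{11}\alpha_1 + \cdots + t_{1n}\alpha_n)y_1 + \cdots + (t_{m1}\alpha_1 + \cdots + t_{mn}\alpha_n)y_m.$$

Again the expansion of $T(x)$ in terms of the basis $B_2$ is unique. Therefore,

$$\beta_1 = t_{11}\alpha_1 + t_{12}\alpha_2 + \cdots + t_{1n}\alpha_n,$$
$$\beta_2 = t_{21}\alpha_1 + t_{22}\alpha_2 + \cdots + t_{2n}\alpha_n,$$
$$\vdots$$
$$\beta_m = t_{m1}\alpha_1 + t_{m2}\alpha_2 + \cdots + t_{mn}\alpha_n.$$

Or in terms of matrices

$$\begin{bmatrix} \beta_1 \\ \beta_2 \\ \vdots \\ \beta_m \end{bmatrix} = \begin{bmatrix} t_{11} & t_{12} & \cdots & t_{1n} \\ t_{21} & t_{22} & \cdots & t_{2n} \\ \vdots & \vdots & & \vdots \\ t_{m1} & t_{m2} & \cdots & t_{mn} \end{bmatrix} \begin{bmatrix} \alpha_1 \\ \alpha_2 \\ \vdots \\ \alpha_n \end{bmatrix}.$$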



We say that the matrix

$$[T] = \begin{bmatrix} t_{11} & \cdots & t_{1n} \\ \vdots & & \vdots \\ t_{m1} & \cdots & t_{mn} \end{bmatrix}$$

represents the linear transformation $T\colon X \to Y$. Carefully note that $T$ and $[T]$ are
not the same thing. $T$ is a rule for assigning $y$'s to $x$'s, while $[T]$ is an $m \times n$ array
of scalars. The matrix $[T]$ represents the linear transformation $T$. More precisely,
$[T]$ represents $T$ relative to the bases $B_1$ and $B_2$, in the sense that $[T]$ together
with $B_1$ and $B_2$ can be used to solve the equation $y = Tx$.


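EXAMPLE 1. Let $X$ be the linear space made up of all third degree
polynomials, that is, all $x(t)$ of the form

$$x(t) = \alpha_1 + \alpha_2t + \alpha_3t^2 + \alpha_4t^3, \qquad -\infty < t < \infty.$$

Let $Y$ be the linear space made up of all second degree polynomials, that is,

$$y(t) = \beta_1 + \beta_2t + \beta_3t^2.$$

Let $D$ be the derivative operation restricted to $X$. Clearly $\mathcal{R}(D) = Y$ and $D$ is
linear. One basis for $X$ is

$$B_1 = \{x_1,x_2,x_3,x_4\},$$

where $x_1 = 1$, $x_2 = t$, $x_3 = t^2$, $x_4 = t^3$. Similarly, a basis for $Y$ is

$$B_2 = \{y_1,y_2,y_3\},$$

where $y_1 = 1$, $y_2 = t$, $y_3 = t^2$. Then

$$D(x_1) = 0,$$
$$D(x_2) = y_1,$$
$$D(x_3) = 2y_2,$$
$$D(x_4) = 3y_3.$$

Thus

$$\begin{bmatrix} \beta_1 \\ \beta_2 \\ \beta_3 \end{bmatrix} = \begin{bmatrix} 0 & 1 & 0 & 0 \\ 0 & 0 & 2 & 0 \\ 0 & 0 & 0 & 3 \end{bmatrix} \begin{bmatrix} \alpha_1 \\ \alpha_2 \\ \alpha_3 \\ \alpha_4 \end{bmatrix}$$

and relative to the bases $B_1$ and $B_2$ the linear transformation $D$ is represented by
the matrix

$$[D] = \begin{bmatrix} 0 & 1 & 0 & 0 \\ 0 & 0 & 2 & 0 \\ 0 & 0 & 0 & 3 \end{bmatrix}.$$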



Of course, if one changes bases, the matrix that represents $D$ will probably
change. For example, if instead of $B_1$ we let $B_3 = \{1+t,\ t+t^2,\ t^2+t^3,\ t^3\}$
be the basis for $X$, then

$$D(x_1) = y_1,$$
$$D(x_2) = y_1 + 2y_2,$$
$$D(x_3) = 2y_2 + 3y_3,$$
$$D(x_4) = 3y_3,$$

and relative to $B_3$ and $B_2$ the transformation $D$ is now represented by

$$[D] = \begin{bmatrix} 1 & 1 & 0 & 0 \\ 0 & 2 & 2 & 0 \\ 0 & 0 & 3 & 3 \end{bmatrix}.$$

Carefully note that in each case the same transformation, $D\colon X \to Y$, is being
represented. Roughly speaking, the matrices change because we change our
coordinate systems. ∎
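
The construction of $[D]$ column by column can be sketched in a few lines. This is an illustration, not part of the original text: polynomials are stored as coefficient lists, the function names `derivative` and `matrix_of_D` are hypothetical, and the second domain basis $\{1+t,\ t+t^2,\ t^2+t^3,\ t^3\}$ is simply the choice made for this sketch. Because the range basis is $\{1, t, t^2\}$, the coordinates of $D(x_j)$ are just its coefficients.

```python
def derivative(p):
    """Derivative of a polynomial given by coefficients [c0, c1, c2, ...]."""
    return [k * p[k] for k in range(1, len(p))]

def matrix_of_D(domain_basis, m=3):
    """Matrix of D relative to domain_basis and the range basis {1, t, t^2}.

    Column j holds the coordinates of D(x_j); for the monomial range basis
    these are the first m coefficients of the derivative.
    """
    cols = []
    for p in domain_basis:
        d = derivative(p)
        cols.append(d + [0] * (m - len(d)))
    # Transpose the list of columns into a list of rows.
    return [[col[i] for col in cols] for i in range(m)]

# Domain bases as coefficient lists [c0, c1, c2, c3].
B1 = [[1, 0, 0, 0], [0, 1, 0, 0], [0, 0, 1, 0], [0, 0, 0, 1]]   # 1, t, t^2, t^3
B3 = [[1, 1, 0, 0], [0, 1, 1, 0], [0, 0, 1, 1], [0, 0, 0, 1]]   # 1+t, t+t^2, t^2+t^3, t^3

D1 = matrix_of_D(B1)
D3 = matrix_of_D(B3)

# Check beta = [D] alpha for x(t) = 2 - t + 4 t^3, whose derivative is -1 + 12 t^2.
alpha = [2, -1, 0, 4]
beta = [sum(t * a for t, a in zip(row, alpha)) for row in D1]
print(D1, D3, beta)
```

The same transformation yields two different arrays, one for each choice of coordinates on the domain.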


EXERCISES 


1. Let $[T]$ be an $m \times n$ matrix of scalars, and let $X$ and $Y$ be linear spaces where
$\dim(X) = n$ and $\dim(Y) = m$. Show that there exists a linear transformation
$T\colon X \to Y$ such that $[T]$ represents $T$ relative to some Hamel basis.


2. Let $X$ and $Y$ be finite-dimensional linear spaces with $\dim X = n$ and $\dim Y = m$.
Consider the linear space $Lt[X,Y]$ of all linear transformations $T\colon X \to Y$ and
the linear space $M_{mn}$ of all $m \times n$ matrices of scalars. The relation "$[T]$
represents $T$ relative to $B_1$ and $B_2$" is a mapping of $Lt[X,Y]$ into $M_{mn}$. Show that
it is an isomorphism. In other words, $Lt[X,Y]$ and $M_{mn}$ are isomorphic.


3. Let $X$, $Y$, and $Z$ be linear spaces over the same scalar field, and let $B_1$, $B_2$,
and $B_3$ be bases for $X$, $Y$, and $Z$, respectively. Let $T_1\colon X \to Y$ and $T_2\colon Y \to Z$
be linear. Show that the matrix $[T_2T_1]$ that represents the composition $T_2T_1\colon
X \to Z$ relative to $B_1$ and $B_3$ is $[T_2T_1] = [T_2][T_1]$ where $[T_1]$ represents $T_1$
relative to $B_1$ and $B_2$, $[T_2]$ represents $T_2$ relative to $B_2$ and $B_3$, and $[T_2][T_1]$
denotes the usual matrix product.


4. Let $X$ be a finite-dimensional linear space and let $B_1 = \{x_1,\ldots,x_n\}$ and
$B_2 = \{y_1,\ldots,y_n\}$ be two bases for $X$. Thus any $x \in X$ can be expressed
uniquely in the form

$$x = \alpha_1x_1 + \cdots + \alpha_nx_n$$

or

$$x = \beta_1y_1 + \cdots + \beta_ny_n.$$

Show that an $n \times n$ matrix can be used to represent the transformation from
the $B_1$ to $B_2$ coordinate system. Moreover, show that the same $n \times n$ matrix
is the representation of the identity transformation $I\colon X \to X$ relative to $B_1$
and $B_2$.




5. Let $X$ and $Y$ be linear spaces over the same scalar field, and let $B_1$ and $B_2$
be countable Hamel bases for $X$ and $Y$, respectively. Further, let $T\colon X \to Y$ be
linear. Show that an infinite matrix can be used to represent $T$.


6. Discuss the use of matrices to represent (in the sense of this section) the linear
transformations that model linear (time-invariant and time-varying) sampled
data systems.


7. Let $X$ be the linear space made up of all functions $\Phi(t,x)$ of the form

$$\Phi(t,x) = a_1 + (a_{21}t + a_{22}x) + (a_{31}t^2 + a_{32}tx + a_{33}x^2) + \cdots$$
$$+ (a_{n1}t^{n-1} + a_{n2}t^{n-2}x + \cdots + a_{nn}x^{n-1}),$$

where $(t,x) \in [0,T] \times [0,L]$, $n$ is a fixed positive integer, and the $a$'s are
scalars. What is the dimension of $X$? Show that the equation

$$\frac{\partial}{\partial t} - k\frac{\partial^2}{\partial x^2}$$

(where $k$ is a constant) represents a linear transformation of $X$ into itself.
Represent this linear transformation with a matrix. Is the transformation
one-to-one? Does it map $X$ onto itself?


8. Consider the linear operator $K\colon X \to X$ given by $y = Kx$, where

$$y(t) = \int_0^{2\pi} k(t,s)x(s)\,ds, \qquad k(t,s) = 4\cos 2(t-s),$$

and $X$ is the linear space spanned by $\{1, \cos s, \cos 2s, \sin s, \sin 2s\}$.
(a) Express $K$ as a matrix.
(b) Is $K$ one-to-one?
(c) Does it map $X$ onto itself?


9. Let $L$ be a linear transformation of $X$ onto $Y$ where $X$ is finite dimensional.
Let $[L]$ be a matrix representing $L$. Show that if $L$ is one-to-one, then $[L]$ is
a square matrix and $\det[L] \neq 0$.


10. Let $X = Y$ denote the space of all fourth degree polynomials in $t$ and define
$L\colon X \to Y$ by $L = D^2 + 2D + 1$ where $D$ is the differential operator, that is,

$$Dx = \frac{dx}{dt}.$$

(a) Represent $L$ by a matrix $[L]$ in terms of the basis $\{1,t,t^2,t^3,t^4\}$ on $X$ and $Y$.

(b) Represent $L^2 = LL$ in terms of this basis. Show that $[L^2] = [L][L]$ in terms
of the usual matrix product.

(c) Repeat steps (a) and (b) for $M = \alpha D^2 + \beta D + \gamma I$.


11. Let $X_1$ be the linear space of all functions of the form $\{\alpha + \alpha_1\cos t + \beta_1\sin t\}$
for $0 \le t \le 2\pi$ and define $L\colon X_1 \to X_1$ by $y = Lx$ where

$$y(t) = \int_0^{2\pi} [1 + \cos(t-s)]\,x(s)\,ds.$$

(a) Represent $L$ as a matrix operator.
(b) Do the same where $X_1$ is replaced by $X_2$, the collection of all functions
of the form $\{\alpha + \alpha_1\cos t + \alpha_2\cos 2t + \beta_1\sin t + \beta_2\sin 2t\}$ and $L$ maps
$X_2$ into itself.

(c) Do the same for the operator $M\colon X_2 \to X_2$ given by

$$y(t) = \int_0^{2\pi} [1 + \cos(t-s) + \sin 2(t+s)]\,x(s)\,ds.$$


12. Let $X_n$ denote the space of all polynomials in $t$ of degree $\le n$.
(a) Consider the operator

$$y = Lx \iff y(t) = \frac{d}{dt}\Big[(t^2 - 1)\frac{dx}{dt}\Big]$$

on $X_3$. Find a representation of $L\colon X_3 \to X_3$ with respect to the basis

$$A = \Big\{1,\ t,\ \tfrac{3}{2}t^2 - \tfrac{1}{2},\ \tfrac{5}{2}t^3 - \tfrac{3}{2}t\Big\}.$$

(b) Find a basis $B$ for $X_3$ such that the operator

$$y = Hx \iff y(t) = \frac{d^2x}{dt^2} - 2t\frac{dx}{dt}$$

can be represented by a diagonal matrix.

(c) Can the operators $L$ and $H$ be represented by diagonal matrices on $X_n$?
(The operators $L$ and $H$ generate the Legendre polynomials and the Hermite
polynomials, respectively. See Section 7.14 and Exercise 4, Section 7.5.)


9. EQUIVALENT LINEAR TRANSFORMATIONS 


Two linear spaces are viewed as being essentially the same linear space if
they are isomorphic to one another. In more or less the same spirit two linear
transformations can be essentially the same. The idea is to connect the two by
means of isomorphic transformations of their domains and ranges. Let $T\colon X \to Y$
and $\mathcal{T}\colon \mathcal{X} \to \mathcal{Y}$ be linear transformations. Further suppose that $X$ and $\mathcal{X}$ are
isomorphic and that $Y$ and $\mathcal{Y}$ are also isomorphic. Let $U\colon X \to \mathcal{X}$ and $W\colon Y \to \mathcal{Y}$
be the isomorphisms. The situation is illustrated in Figure 4.9.1. We see that we
have two mappings of $X$ into $Y$; namely, $T$ and $T' = W^{-1}\mathcal{T}U$. Similarly, $\mathcal{T}$ and
$\mathcal{T}' = WTU^{-1}$ are two mappings of $\mathcal{X}$ into $\mathcal{Y}$. One does not expect $T$ and $T'$ or
$\mathcal{T}$ and $\mathcal{T}'$ to be the same. However, it can happen (and this is very interesting
when it does) that one has $T = T'$, or equivalently that $\mathcal{T} = \mathcal{T}'$.


Figure 4.9.1.


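EXAMPLE 1. Let $X = Y = L_2(-\infty,\infty)$ and $\mathcal{X} = \mathcal{Y} = L_2(-i\infty,i\infty)$ and let
$T\colon X \to Y$ be

$$(Tx)(t) = \int_{-\infty}^{t} e^{-(t-\tau)}x(\tau)\,d\tau, \qquad -\infty < t < \infty,$$

and $\mathcal{T}\colon \mathcal{X} \to \mathcal{Y}$ be

$$(\mathcal{T}x)(i\omega) = \frac{1}{1 + i\omega}\,x(i\omega), \qquad -\infty < \omega < \infty.$$

It is well-known that $\mathcal{F}$, the Fourier transform, is an isomorphic mapping of
$L_2(-\infty,\infty)$ onto $L_2(-i\infty,i\infty)$ (Section 5.19) and that $T = \mathcal{F}^{-1}\mathcal{T}\mathcal{F}$ and
$\mathcal{T} = \mathcal{F}T\mathcal{F}^{-1}$. ∎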


Let us formalize the above remarks in a definition. 


4.9.1 DEFINITION. Let $X$, $\mathcal{X}$, $Y$, $\mathcal{Y}$ be linear spaces over the same scalar
field where $X$ and $\mathcal{X}$ are isomorphic and $Y$ and $\mathcal{Y}$ are isomorphic. The linear
transformations $T\colon X \to Y$ and $\mathcal{T}\colon \mathcal{X} \to \mathcal{Y}$ are said to be isomorphically equivalent
(to one another) if there exist isomorphisms $U\colon X \to \mathcal{X}$ and $W\colon Y \to \mathcal{Y}$ such that

$$T = W^{-1}\mathcal{T}U$$

and

$$\mathcal{T} = WTU^{-1}.$$


It is trivial to show that $T = W^{-1}\mathcal{T}U$ if and only if $\mathcal{T} = WTU^{-1}$.
There is a special case of two linear transformations being isomorphically
equivalent which is particularly important; namely, similarity.


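4.9.2 DEFINITION. Let $X$ and $\mathcal{X}$ be isomorphic linear spaces. The linear
transformations $T\colon X \to X$ and $\mathcal{T}\colon \mathcal{X} \to \mathcal{X}$ are said to be similar if there exists an
isomorphism $U\colon X \to \mathcal{X}$ such that

$$T = U^{-1}\mathcal{T}U$$

and

$$\mathcal{T} = UTU^{-1}.$$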
The situation described in this definition is illustrated in Figure 4.9.2.
It is difficult to overemphasize the importance of the two concepts presented
in Definitions 4.9.1 and 4.9.2. In fact, these concepts are basic to the entire theory
of linear operators. Given a linear transformation $T\colon X \to Y$ (or $T\colon X \to X$) one
tries to find an isomorphically equivalent (or similar) transformation $\mathcal{T}\colon \mathcal{X} \to \mathcal{Y}$
(or $\mathcal{T}\colon \mathcal{X} \to \mathcal{X}$) which is somehow easier to work with than $T$. For example, it is
usually easier to work with the transfer functions obtained by Fourier or Laplace
transforms than to work with the corresponding differential or integral operators.


Figure 4.9.2.
We end this section with a warning. The meanings of the terms "isomorphically
equivalent" and "similar" are those given in Definitions 4.9.1 and 4.9.2 and
absolutely nothing more. They are algebraic concepts. Two linear transformations
$T$ and $\mathcal{T}$ that are isomorphically equivalent or similar are essentially the same
linear transformation from an algebraic point of view. Yet in specific cases two
isomorphically equivalent transformations can differ in a world of ways not covered
by these definitions. To name just one, there is the whole question of topological
structure that may be present. We shall study this further in the next chapter.
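
For finite-dimensional spaces the relations in Definitions 4.9.1 and 4.9.2 reduce to matrix identities, which can be checked directly. The sketch below is an illustration under stated assumptions: the spaces are taken to be $Q^2$ with all transformations given as $2 \times 2$ rational matrices, and the particular matrices `T`, `U`, `W` and the helper names are hypothetical choices, not taken from the text.

```python
from fractions import Fraction

def matmul(a, b):
    """Product of two matrices stored as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def inverse2(m):
    """Inverse of a nonsingular 2 x 2 matrix."""
    (a, b), (c, d) = m
    det = Fraction(a * d - b * c)
    if det == 0:
        raise ValueError("matrix is singular, so it is not an isomorphism")
    return [[d / det, -b / det], [-c / det, a / det]]

T = [[2, 1], [0, 3]]     # the transformation T
U = [[1, 1], [1, 2]]     # isomorphism on the domain side
W = [[2, 0], [1, 1]]     # isomorphism on the range side

# Isomorphic equivalence: define calT = W T U^{-1}, then recover T = W^{-1} calT U.
calT = matmul(matmul(W, T), inverse2(U))
T_back = matmul(matmul(inverse2(W), calT), U)

# Similarity is the special case W = U.
S = matmul(matmul(U, T), inverse2(U))
S_back = matmul(matmul(inverse2(U), S), U)

trace_T = T[0][0] + T[1][1]
trace_S = S[0][0] + S[1][1]
print(T_back == T, S_back == T, trace_T == trace_S)
```

The equal traces are one example of an algebraic quantity shared by similar transformations; nothing in this check says anything about topological structure.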


EXERCISES 


1. Show that the entire construction given in Section 8 can be viewed as a special
case of linear transformations being isomorphically equivalent. [Hint: Given
bases $B_1$ and $B_2$ for $X$ and $Y$, respectively, let $U\colon X \to F^n$ and $W\colon Y \to F^m$
denote the operations of finding the "coordinates" of points in $X$ and $Y$,
respectively. Then show that $T\colon X \to Y$ is isomorphically equivalent to a linear
matrix transformation $\mathcal{T}\colon F^n \to F^m$.]


2. Let $X$ be the sequence space $l_2(-\infty,\infty)$ and let $Z$ be the same as the linear
space $Z$ in Example 4, Section 5. Let $S_r\colon X \to X$ be the right shift operator. Let
$\mathcal{T}\colon Z \to Z$ be defined by $(\mathcal{T}y)(z) = z^{-1}y(z)$ for $|z| = 1$. Show that $\mathcal{T}$ and
$S_r$ are similar. [Hint: Use the z-transform.]


3. Let $X$ be a linear space, and let $Lt[X,X]$ be the linear space of linear
transformations of $X$ into itself. Show that the relation $L_1 \sim L_2$ defined on $Lt[X,X]$ by
$L_1 \sim L_2 \iff \{L_1$ and $L_2$ are similar$\}$ is an equivalence relation. Let $Lt[X,Y]$
denote the linear space of all linear transformations of $X$ into $Y$. Show that
$\{L_1$ and $L_2$ are isomorphically equivalent$\}$ is an equivalence relation on $Lt[X,Y]$.


4. Let $X$, $Y$, $\mathcal{X}$, $\mathcal{Y}$ be finite-dimensional linear spaces over the same scalar field,
where $X$ and $\mathcal{X}$ are isomorphic and $Y$ and $\mathcal{Y}$ are isomorphic. Let $T: X \to Y$ and
$\mathcal{T}: \mathcal{X} \to \mathcal{Y}$ be linear. Show that $T$ and $\mathcal{T}$ are isomorphically equivalent if and
only if $\dim[\mathcal{R}(T)] = \dim[\mathcal{R}(\mathcal{T})]$, where $\mathcal{R}(\cdot)$ denotes the range.


5. Let $S$ and $T$ be linear operators on $R^2$ and assume that there is a basis for $R^2$
in which $S$ and $T$ have the following representations.

$$S = \begin{pmatrix} 1 & 1 \\ 0 & 1 \end{pmatrix}, \qquad T = \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix}.$$

(a) Show that $S$ and $T$ are not similar.

(b) Are $S$ and $T$ ever isomorphically equivalent?
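6. Define the $2 \times 2$ matrices $\sigma_k$, $k = 1, 2, 3$, by

$$\sigma_1 = \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}, \quad \sigma_2 = \begin{pmatrix} 0 & -i \\ i & 0 \end{pmatrix}, \quad \sigma_3 = \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix}.$$

(These are referred to as the Pauli spin matrices.)
(a) Show that:

$$\begin{aligned}
\sigma_k^2 &= I, \quad k = 1, 2, 3, \\
\sigma_1\sigma_2 &= -\sigma_2\sigma_1 = i\sigma_3, \\
\sigma_2\sigma_3 &= -\sigma_3\sigma_2 = i\sigma_1, \\
\sigma_3\sigma_1 &= -\sigma_1\sigma_3 = i\sigma_2.
\end{aligned} \qquad (4.9.1)$$

(b) Let $S$ be a nonsingular $2 \times 2$ matrix and define $\tau_k$ by $\tau_k = S\sigma_k S^{-1}$,
$k = 1, 2, 3$. Show that $\tau_1$, $\tau_2$, and $\tau_3$ satisfy (4.9.1).

(c) Let $\tau_1$, $\tau_2$, and $\tau_3$ be three $2 \times 2$ matrices that satisfy (4.9.1). Show that there
is a nonsingular matrix $S$ with the property that $\tau_k = S\sigma_k S^{-1}$, $k = 1, 2, 3$.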


7. Let $M$ and $N$ be linear operators on $C^n$ that can be represented as diagonal
matrices

$$[M] = \operatorname{diag}\{\mu_1, \ldots, \mu_n\},$$
$$[N] = \operatorname{diag}\{\nu_1, \ldots, \nu_n\},$$

in terms of some basis. Show that $M$ and $N$ are similar if and only if the two
sets $\{\mu_1, \ldots, \mu_n\}$ and $\{\nu_1, \ldots, \nu_n\}$ are the same.


Part B 
Further Topics 


10. DIRECT SUMS AND SUMS 


In Chapter 3 we saw how two metric spaces can be put together to form a
new metric space called the product space. In this section we shall do a similar
thing for linear spaces. Suppose that $X_1$ and $X_2$ are linear spaces over the same
scalar field. The linear spaces $X_1$ and $X_2$ can be combined to form a new linear
space referred to as the direct sum of $X_1$ and $X_2$ and denoted by $X_1 \oplus X_2$. The
underlying set of $X_1 \oplus X_2$ is the Cartesian product, $X_1 \times X_2$, of the underlying
sets of $X_1$ and $X_2$. Thus a point in $X_1 \oplus X_2$ is an ordered pair $(x_1,x_2)$, where
$x_1 \in X_1$ and $x_2 \in X_2$. Addition is defined by $(x_1,x_2) + (y_1,y_2) = (x_1 + y_1, x_2 + y_2)$.
Note that "+" on the left is defined in terms of the addition operations of $X_1$ and
$X_2$ appearing on the right. Scalar multiplication is defined by $\alpha(x_1,x_2) = (\alpha x_1, \alpha x_2)$.
The origin is $(0,0)$ and $-(x_1,x_2) = (-x_1,-x_2)$.
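The componentwise definitions above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: we take $X_1 = R^2$ and $X_2 = R^3$, represent points as tuples, and the helper names `vec_add`, `vec_scale`, `ds_add`, and `ds_scale` are hypothetical.

```python
def vec_add(a, b):
    """Addition inside one summand space (tuples of equal length)."""
    return tuple(p + q for p, q in zip(a, b))

def vec_scale(alpha, a):
    """Scalar multiplication inside one summand space."""
    return tuple(alpha * p for p in a)

def ds_add(u, v):
    """(x1, x2) + (y1, y2) = (x1 + y1, x2 + y2) in the direct sum."""
    return (vec_add(u[0], v[0]), vec_add(u[1], v[1]))

def ds_scale(alpha, u):
    """alpha (x1, x2) = (alpha x1, alpha x2) in the direct sum."""
    return (vec_scale(alpha, u[0]), vec_scale(alpha, u[1]))

# A point of R^2 (+) R^3 is an ordered pair (x1, x2).
u = ((1, 2), (3, 4, 5))
v = ((10, 20), (30, 40, 50))

total = ds_add(u, v)
doubled = ds_scale(2, u)
origin = ds_add(u, ds_scale(-1, u))   # u + (-u) should be the origin
```

Note that the two components never mix: each operation on the pair is carried out separately in $X_1$ and in $X_2$.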

Let us consider some examples. 


EXAMPLE 1. Consider a system with two input channels (for example,
settings of valves #1 and #2) and two output channels (for example, temperature
#1 and flow rate #2) as shown in Figure 4.10.1. Assume that the function at
each channel is a point in the linear space $C[0,T]$. Then an input to the system
(as opposed to one of the input channels) is an ordered pair of continuous
functions, the input on channel #1 and the input on channel #2. Similarly assume
that the system output is an ordered pair of continuous functions. Thus the
mathematical model for the system would be a transformation of $C[0,T] \oplus C[0,T]$ into
itself.

Figure 4.10.1. System with Input Channels #1 and #2 and Output Channels #1 and #2.


EXAMPLE 2. Let us see that the operation of multiplication can represent
either a linear or nonlinear system. Suppose we have some device, see Figure
4.10.2, that multiplies continuous functions of time together on a pointwise basis.
If we let $x_1(t) = f(t)$, where $f$ is a fixed point in $C[0,T]$, and we consider the resulting
mapping $x_2(t) \to f(t)x_2(t)$ of $C[0,T]$ into itself, then this mapping is linear. On the
other hand, if we consider the mapping of $C[0,T] \oplus C[0,T]$ into $C[0,T]$
represented by

$$(x_1,x_2) \to x_0 = x_1 x_2,$$

this mapping is nonlinear. Note that whether or not the multiplier is "linear"
depends on how it is being interpreted. ∎

Figure 4.10.2. Multiplier: $x_0(t) = x_1(t)x_2(t)$, $0 \le t \le T$.


EXAMPLE 3. Similarly, addition can represent either a linear or nonlinear
system. Suppose we have some device, see Figure 4.10.3, that adds continuous
functions together on a pointwise basis. If we let $x_2(t) = f(t)$, $f \in C[0,T]$, then the
mapping

$$x_0(t) = x_1(t) + f(t), \qquad 0 \le t \le T,$$

of $C[0,T]$ into itself is linear if and only if $f(t) \equiv 0$. But the transformation
$(x_1,x_2) \to x_0$ given by

$$x_0(t) = x_1(t) + x_2(t)$$

is a linear mapping of $C[0,T] \oplus C[0,T]$ into $C[0,T]$. ∎

Figure 4.10.3. Adder: $x_0(t) = x_1(t) + x_2(t)$, $0 \le t \le T$.
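The four interpretations in Examples 2 and 3 can be checked numerically. The sketch below is only an illustration: continuous functions are replaced by short lists of integer samples (an assumption made so that the arithmetic is exact), and the names `multiply`, `add`, and `scale` are hypothetical.

```python
def multiply(x1, x2):
    """Pointwise product of two sampled functions."""
    return [a * b for a, b in zip(x1, x2)]

def add(x1, x2):
    """Pointwise sum of two sampled functions."""
    return [a + b for a, b in zip(x1, x2)]

def scale(alpha, x):
    """Scalar multiple of a sampled function."""
    return [alpha * a for a in x]

f = [1, 2, 3]          # the fixed function
x = [4, 5, 6]
y = [7, 8, 9]

# Multiplier with one input fixed at f: x -> f x is additive.
gain_additive = multiply(f, add(x, y)) == add(multiply(f, x), multiply(f, y))

# Multiplier as a map of pairs: scaling the pair (x, y) by 2 scales the
# output by 4, not by 2, so homogeneity fails.
pair_product_homogeneous = multiply(scale(2, x), scale(2, y)) == scale(2, multiply(x, y))

# Adder with one input fixed at f != 0: x -> x + f is not additive.
offset_additive = add(add(x, y), f) == add(add(x, f), add(y, f))

# Adder as a map of pairs: S((x, y) + (f, x)) = S(x, y) + S(f, x).
pair_sum_additive = add(add(x, f), add(y, x)) == add(add(x, y), add(f, x))
```

A finite check of this kind cannot prove linearity, but a single failed identity is enough to show that a map is nonlinear.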

Needless to say, we form the direct sum $X_1 \oplus X_2 \oplus \cdots \oplus X_n$ of $n$ linear spaces
over the same scalar field in the obvious way.

Often when we are discussing two linear spaces, $X_1$ and $X_2$, over the same
scalar field, they are in fact both linear subspaces of some containing linear space
$X$. In that situation we have two ways to put $X_1$ and $X_2$ together to form a new
linear space: direct sum and (inner) sum. The sum of $X_1$ and $X_2$, denoted $X_1 + X_2$,
is the linear subspace of $X$ made up of all points $x = x_1 + x_2$, where $x_1 \in X_1$ and
$x_2 \in X_2$. That $X_1 + X_2$ is indeed a linear subspace of $X$ is trivial. Note also that
(a) $X_1 \subset X_1 + X_2$, (b) $X_2 \subset X_1 + X_2$, (c) if $X_1 \subset X_2$, then $X_1 + X_2 = X_2$, and
(d) $X_1 + X_2$ is the linear subspace spanned by the set $X_1 \cup X_2$.


EXAMPLE 4. Let $X = C[-T,T]$; let $X_1$ be the linear subspace of $X$ made up
of all even functions (that is, $x(t) = x(-t)$ for all $t \in [-T,T]$); and let $X_2$ be the
linear subspace of $X$ made up of all odd functions (that is, $x(t) = -x(-t)$ for all
$t \in [-T,T]$). In this case it turns out that $X_1 + X_2 = X$. Indeed, if $x$ is any point
in $X$, then

$$x(t) = \frac{x(t) + x(-t)}{2} + \frac{x(t) - x(-t)}{2} = x_1(t) + x_2(t),$$

where $x_1 \in X_1$ and $x_2 \in X_2$. ∎
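The even/odd decomposition of Example 4 is easy to try out. The following is a minimal sketch; the function names `even_part` and `odd_part` and the sample polynomial are hypothetical choices made for illustration.

```python
def even_part(x):
    """Return x1(t) = (x(t) + x(-t)) / 2."""
    return lambda t: (x(t) + x(-t)) / 2

def odd_part(x):
    """Return x2(t) = (x(t) - x(-t)) / 2."""
    return lambda t: (x(t) - x(-t)) / 2

def x(t):
    # Sample function: t^3 is odd, 2 t^2 + 1 is even.
    return t**3 + 2 * t**2 + 1

x1 = even_part(x)
x2 = odd_part(x)

t = 2
recombined = x1(t) + x2(t)          # should reproduce x(t)
is_even = x1(-t) == x1(t)
is_odd = x2(-t) == -x2(t)
```

For this sample function the even part is $2t^2 + 1$ and the odd part is $t^3$, as one would expect.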


We say that two linear subspaces $X_1$ and $X_2$ of a linear space $X$ are disjoint³
if $X_1 \cap X_2 = \{0\}$, that is, their intersection contains exactly one point, the origin.


4.10.1 LEMMA. Let $X_1$ and $X_2$ be linear subspaces of a linear space $X$. Then
for each $x$ in $X_1 + X_2$ there is a unique $x_1 \in X_1$ and a unique $x_2 \in X_2$ such that
$x = x_1 + x_2$ if and only if $X_1$ and $X_2$ are disjoint.


Proof: Consider the "if" part. Assume that $X_1$ and $X_2$ are disjoint. Let
$x = x_1 + x_2 = y_1 + y_2$, where $x_1, y_1 \in X_1$ and $x_2, y_2 \in X_2$. Then $x_1 - y_1 =
y_2 - x_2$. But $(x_1 - y_1) \in X_1$ and $(y_2 - x_2) \in X_2$, and $X_1 \cap X_2 = \{0\}$; therefore,
$x_1 - y_1 = 0$ and $x_2 - y_2 = 0$.

Now consider the "only if" part. Suppose that $x_1$ and $x_2$ are uniquely
determined for each $x$ in $X_1 + X_2$. We must show that it follows that $X_1$ and $X_2$ are
disjoint. If $X_1$ and $X_2$ are not disjoint, there exists an $x^*$, $x^* \ne 0$, in $X_1 \cap X_2$.
Then if $x = x_1 + x_2$, it follows that $x = (x_1 + \alpha x^*) + (x_2 - \alpha x^*)$ for any scalar
$\alpha$. But then $x_1$ and $x_2$ are not unique, which is a contradiction. ∎


Although the sum $X_1 + X_2$ and the direct sum $X_1 \oplus X_2$ are different linear
spaces, it is possible to compare them. In fact there is a natural mapping, call it
$\varphi$, of $X_1 \oplus X_2$ into $X_1 + X_2$ defined by

$$\varphi[(x_1,x_2)] = x_1 + x_2.$$

We immediately note that $\varphi$ is a linear mapping of $X_1 \oplus X_2$ onto $X_1 + X_2$.
However, $\varphi$ need not be invertible, for it need not be one-to-one. But when $\varphi$ is
invertible, it follows that $\varphi$ is an isomorphism and that $X_1 + X_2$ and $X_1 \oplus X_2$
are isomorphic. We now seek conditions under which $\varphi$ is an isomorphism, that
is, conditions under which $X_1 + X_2$ and $X_1 \oplus X_2$ are algebraically equivalent.


3 This is an example of a minor ambiguity one sometimes finds in mathematical terminology,
that is, "disjoint" linear subspaces are not "disjoint" as sets since they must contain the origin.
However, the dual usage of "disjoint" is so natural that it is now universally accepted.


4.10.2 THEOREM. Let $X_1$ and $X_2$ be linear subspaces of a linear space $X$. The
natural mapping $\varphi$ of $X_1 \oplus X_2$ onto $X_1 + X_2$ is an isomorphism if and only if $X_1$
and $X_2$ are disjoint.


The proof of this theorem involves a simple application of Lemma 4.10.1,
and we ask the reader to check this.
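For subspaces of $R^n$ given by bases, the condition of Theorem 4.10.2 can be tested by a rank computation: the natural mapping is one-to-one exactly when the combined list of basis vectors is linearly independent. The sketch below assumes that each input list is itself a basis (linearly independent); the helpers `rank` and `phi_is_isomorphism` are hypothetical names, and exact rational arithmetic is used to avoid rounding issues.

```python
from fractions import Fraction

def rank(rows):
    """Rank of a list of vectors, by Gauss-Jordan elimination over the rationals."""
    m = [[Fraction(v) for v in r] for r in rows]
    rk = 0
    ncols = len(m[0]) if m else 0
    for c in range(ncols):
        # Find a row at or below position rk with a nonzero entry in column c.
        pivot = next((r for r in range(rk, len(m)) if m[r][c] != 0), None)
        if pivot is None:
            continue
        m[rk], m[pivot] = m[pivot], m[rk]
        for r in range(len(m)):
            if r != rk and m[r][c] != 0:
                factor = m[r][c] / m[rk][c]
                m[r] = [a - factor * b for a, b in zip(m[r], m[rk])]
        rk += 1
    return rk

def phi_is_isomorphism(basis1, basis2):
    """True when dim(X1 + X2) = dim X1 + dim X2, i.e. X1 and X2 are disjoint."""
    vectors = list(basis1) + list(basis2)
    return rank(vectors) == len(vectors)

# X1 = span{e1, e2} and X2 = span{e3} meet only in the origin.
disjoint_case = phi_is_isomorphism([(1, 0, 0), (0, 1, 0)], [(0, 0, 1)])

# X1 = span{e1, e2} and X2 = span{e2, e3} share the line through e2,
# so the pair (e2, -e2) is a nonzero point that the natural mapping sends to 0.
overlapping_case = phi_is_isomorphism([(1, 0, 0), (0, 1, 0)], [(0, 1, 0), (0, 0, 1)])
```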

Since $X_1 + X_2$ and $X_1 \oplus X_2$ are isomorphic when $X_1 \cap X_2 = \{0\}$, it is common
practice when $X_1$ and $X_2$ are disjoint to refer to $X_1 + X_2$ as the direct sum of
$X_1$ and $X_2$ and denote it by $X_1 \oplus X_2$. Needless to say, this can be confusing and
sometimes misleading. This practice is particularly dangerous when the containing
linear space has topological structure or algebraic structure in addition to its
linear space structure.*

If $X = X_1 + X_2$, where $X_1$ and $X_2$ are disjoint linear subspaces, then we shall
say that $X_2$ is an algebraic complement of $X_1$. The next result is simple, but we
caution the reader to observe that it is an algebraic fact and not a topological fact.


4.10.3 THEOREM. Let $X_1$ be a linear subspace of a linear space $X$. Then $X_1$
has an algebraic complement.


Proof: Let $B_1$ be a Hamel basis for $X_1$ and choose $B_2$ so that $B_1 \cap B_2 = \emptyset$
and $B = B_1 \cup B_2$ is a Hamel basis for $X$. (See Theorem 4.7.2.) The linear space
$X_2$ generated by $B_2$ is then an algebraic complement of $X_1$. ∎


One can show (see Exercise 5) that every algebraic complement of a linear
subspace $X_1$ has the same dimension. Indeed, this follows directly from the last
result in the case of a finite-dimensional space $X$. In other words, the dimension
of the algebraic complements of $X_1$ is a property of $X_1$. We refer to this dimension
as the co-dimension of $X_1$. Roughly speaking, as the dimension goes up, the
co-dimension goes down and vice versa.


EXAMPLE 5. In this example we show that two linear time-invariant
mappings do not necessarily commute. In particular, we exhibit two noncommuting
linear time-invariant mappings $T_1$ and $T_2$ defined on a linear subspace $X$ of
$l_2(-\infty,\infty)$.

Let $x \in l_2(-\infty,\infty)$, $x \ne 0$, and let $A_x$ denote the linear subspace generated
by $x$ and all shifted versions of $x$ as in Example 5, Section 6. We know from this
example that we can find a $y \in l_2(-\infty,\infty)$, $y \ne 0$, such that $A_y$, the subspace
generated by $y$, is disjoint from $A_x$. Given $x$ and $y$ it is not difficult to show that
one can also always find a $z$, $z \ne 0$, such that the subspaces $A_x$, $A_y$, and $A_z$ are
mutually disjoint. Let $X = A_x + A_y + A_z$. Since each of the $A$-subspaces is invariant
under shifting, it follows that $X$ is also.

We define the linear time-invariant operator $T_1$ on $X$ by setting

$$T_1 x = y, \quad T_1 y = z, \quad \text{and} \quad T_1 z = x.$$


* This issue arises again in Section 5.20. 




It follows from linearity and time-invariance that the above three conditions
uniquely determine $T_1$ on all of $X$. Similarly, we define the linear time-invariant
operator $T_2$ on $X$ by setting

$$T_2 x = z, \quad T_2 y = y, \quad \text{and} \quad T_2 z = x.$$

It then follows that $T_2 T_1 x = T_2(y) = y$ and $T_1 T_2 x = T_1(z) = x$, and this shows
that $T_1$ and $T_2$ do not commute.

We see then that linearity and time-invariance alone are not enough to
guarantee that two operators commute. Let us note though that if such operators are also
continuous (see Section 5.19), then they must commute. We will not prove this
fact here, but the idea of the proof is simply to show (using transform techniques)
that if $T_1$ and $T_2$ are two continuous, linear, time-invariant operators, then there
is an invertible operator $S$ such that

$$S T_1 S^{-1} = H_1 \quad \text{and} \quad S T_2 S^{-1} = H_2,$$

where $H_1$ and $H_2$ denote multiplication operators. [For example, if the base space
for the operators $T_1$ and $T_2$ was $L_2(-\infty,\infty)$ instead of $l_2(-\infty,\infty)$, then $S$ would
be the Fourier transform, and $H_1$ and $H_2$ would have the form given in Example
2, Section 3.] Since $H_1$ and $H_2$ commute, it is easy to see that $T_1$ and $T_2$ commute. ∎
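The bookkeeping behind the non-commutation in Example 5 can be checked with a very small sketch. It is only a caricature: we ignore the shift structure entirely and record just how $T_1$ and $T_2$ act on the three generators, using the hypothetical labels `"x"`, `"y"`, `"z"` and a hypothetical helper `compose`.

```python
# Action of each operator on the generators x, y, z.
T1 = {"x": "y", "y": "z", "z": "x"}
T2 = {"x": "z", "y": "y", "z": "x"}

def compose(A, B):
    """Return the action of 'A after B' on the generators."""
    return {g: A[B[g]] for g in B}

T2T1 = compose(T2, T1)   # first T1, then T2
T1T2 = compose(T1, T2)   # first T2, then T1

image_under_T2T1 = T2T1["x"]   # T2(T1 x) = T2(y)
image_under_T1T2 = T1T2["x"]   # T1(T2 x) = T1(z)
```

Since the two compositions already disagree on the generator $x$, the operators cannot commute on $X$.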


EXERCISES 
1. Let $X = L_2[-\pi,\pi]$, and let $X_1 = V(A_1)$ and $X_2 = V(A_2)$, where
$$A_1 = \{1, \cos t, \cos 2t, \ldots\}$$
and
$$A_2 = \{\sin t, \sin 2t, \ldots\}.$$

Show that $X_1 \oplus X_2$ and $X_1 + X_2$ are isomorphic.


2. If $X_1, X_2, \ldots, X_n$ are linear subspaces of a linear space $X$, then there is a
natural mapping $\varphi$ of $X_1 \oplus X_2 \oplus \cdots \oplus X_n$ into $X_1 + X_2 + \cdots + X_n$. When
is $\varphi$ an isomorphism?

3. Show that $\dim(X_1 \oplus X_2) = \dim(X_1) + \dim(X_2)$.

4. Let $X_1$ and $X_2$ be linear subspaces of a linear space $X$ where $\dim X < \infty$. Show that

$$\dim(X_1 + X_2) = \dim(X_1) + \dim(X_2) - \dim(X_1 \cap X_2).$$


5. Let $M$ be a linear subspace of a linear space $X$. Show that if $N_1$ and $N_2$ are two
algebraic complements of $M$, then $\dim N_1 = \dim N_2$. (That is, the co-dimension
of $M$ depends only on $M$ and $X$, but not on the choice of algebraic complement.)

6. Let $X$ be a linear space and assume that $X = X_1 + \cdots + X_N$ where $X_1, \ldots, X_N$
are linear subspaces of $X$ with $X_i \cap X_j = \{0\}$ for $i \ne j$. Let $B_i$ be a Hamel
basis for $X_i$. Show that $B = B_1 \cup \cdots \cup B_N$ is a Hamel basis for $X$.




11. PROJECTIONS 


Consider the plane $X$ shown in Figure 4.11.1 with the two designated
one-dimensional linear subspaces $M$ and $N$. Since $M$ and $N$ are disjoint, we know from
Section 10 that each point $x$ in the plane can be uniquely expressed in the form
$x = x_1 + x_2$, where $x_1 \in M$ and $x_2 \in N$. Suppose we consider the mapping $P$
which maps the plane $X$ into itself and which is defined by $P(x_1 + x_2) = x_1$.
Geometrically $P$ projects the plane $X$ onto the subspace $M$ along the subspace $N$,
that is, $\mathcal{R}(P) = M$ and $\mathcal{N}(P) = N$. It is easily shown that $P$ is linear. We see also
that $P^2 = P$.

Figure 4.11.1.

We want to extract the essence of the above situation and define a general
notion of projection. It turns out that what we are interested in hinges on the fact
that $P$ is linear and that $P^2 = P$.


4.11.1 DEFINITION. A linear transformation $P$ of a linear space $X$ into itself
is said to be a projection if $P^2 = P$.


We note that $P^2 = P$ does not imply that $P$ is linear. For example, the
nonlinear mapping $f$ of $R$ into itself defined by $f(x) = +1$ for $x \ge 1$; $f(x) = 0$ for
$-1 < x < 1$; and $f(x) = -1$ for $x \le -1$ is such that $f \circ f = f$.
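The nonlinear idempotent mapping just described can be checked directly. The sketch below is only an illustration; the sample points are arbitrary choices.

```python
def f(x):
    """f(x) = +1 for x >= 1, 0 for -1 < x < 1, and -1 for x <= -1."""
    if x >= 1:
        return 1
    if x <= -1:
        return -1
    return 0

samples = [-2.5, -1, -0.5, 0, 0.5, 1, 2.5]

# Idempotent: applying f twice gives the same result as applying it once.
idempotent = all(f(f(x)) == f(x) for x in samples)

# Not additive: f(0.6 + 0.6) = 1, whereas f(0.6) + f(0.6) = 0.
additive_at_sample = f(0.6 + 0.6) == f(0.6) + f(0.6)
```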

The following three theorems show that the Definition 4.11.1 does lead to a 
concept of projection which agrees with our intuitive concept of projection. 


4.11.2 THEOREM. Let $P$ be a projection defined on a linear space $X$. Then the
range and the null space, $\mathcal{R}(P)$ and $\mathcal{N}(P)$, are disjoint linear subspaces of $X$ such that
$X = \mathcal{R}(P) + \mathcal{N}(P)$. That is, $\mathcal{R}(P)$ and $\mathcal{N}(P)$ are algebraic complements of one
another.


Proof: We already know, of course, that the range and null space of a linear
transformation are linear subspaces. Let us show that $\mathcal{R}(P)$ and $\mathcal{N}(P)$ are disjoint.
Let $x \in \mathcal{R}(P) \cap \mathcal{N}(P)$. Since $x \in \mathcal{R}(P)$, there is a $y$ such that $Py = x$, and $P^2 y =
Px = x$. Since $x \in \mathcal{N}(P)$, $Px = 0$, that is, $x = 0$. Hence, $\mathcal{R}(P) \cap \mathcal{N}(P) = \{0\}$. Now let
us show that $\mathcal{R}(P) + \mathcal{N}(P) = X$. Let $x$ be an arbitrary point in $X$. Define $y = Px$
and $z = x - y$. One then has $x = y + z$ where $y \in \mathcal{R}(P)$. However, $Pz = P(x - y) =
Px - P^2 x = 0$, that is, $z \in \mathcal{N}(P)$. ∎


Since $\mathcal{R}(P) + \mathcal{N}(P) = X$ and $\mathcal{R}(P) \cap \mathcal{N}(P) = \{0\}$, we see from Lemma 4.10.1
that each $x \in X$ can be uniquely expressed in the form $x = x_1 + x_2$, where $x_1 \in \mathcal{R}(P)$
and $x_2 \in \mathcal{N}(P)$. Moreover, $Px = P(x_1 + x_2) = x_1$. Referring to Figure 4.11.1, we
see that it makes sense in general to say that "$P$ is the projection onto the
subspace $\mathcal{R}(P)$ along the subspace $\mathcal{N}(P)$."

We note that if $P$ is a projection, then so is $I - P$, with $\mathcal{R}(I - P) = \mathcal{N}(P)$ and
$\mathcal{N}(I - P) = \mathcal{R}(P)$. Hence, if $x = x_1 + x_2$, where $x_1 = Px$, then $x_2 = (I - P)x$.

Suppose that instead of starting with a projection as in Theorem 4.11.2, we
start with two disjoint linear subspaces, $M$ and $N$ with $X = M + N$. Does there
exist a projection $P$ with $\mathcal{R}(P) = M$ and $\mathcal{N}(P) = N$? The answer is yes.


4.11.3 THEOREM. Let $M$ and $N$ be two disjoint linear subspaces of a linear
space $X$ such that $M + N = X$. Then there exists a projection $P$ defined on $X$ such that
$\mathcal{R}(P) = M$ and $\mathcal{N}(P) = N$.


Proof: Again from Lemma 4.10.1 each $x \in X$ can be uniquely expressed as
$x = x_1 + x_2$, where $x_1 \in M$ and $x_2 \in N$. Let $P$ be defined by $P(x) = x_1$. $P$ is obviously
the desired projection. ∎
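The construction in the proof of Theorem 4.11.3 can be carried out explicitly in the plane. In the sketch below, $M$ and $N$ are the lines spanned by hypothetical vectors `m` and `n`; the coefficient of `m` in the unique decomposition $x = a\,m + b\,n$ is found by Cramer's rule, and `make_projection` is a hypothetical helper name.

```python
from fractions import Fraction

def make_projection(m, n):
    """Projection of R^2 onto span{m} along span{n} (m, n not collinear)."""
    det = m[0] * n[1] - m[1] * n[0]
    if det == 0:
        raise ValueError("m and n must span two distinct lines")

    def P(x):
        # Solve x = a*m + b*n for a by Cramer's rule and keep only the M-part.
        a = Fraction(x[0] * n[1] - x[1] * n[0], det)
        return (a * m[0], a * m[1])

    return P

m = (1, 0)            # M is the horizontal axis
n = (1, 1)            # N is the diagonal line
P = make_projection(m, n)

x = (3, 2)            # x = 1*m + 2*n
Px = P(x)
PPx = P(Px)           # applying P again should change nothing
Pn = P(n)             # points of N are sent to the origin
Pm = P(m)             # points of M are left fixed
```

Choosing a different line $N$ changes $P$ even though the range $M$ stays the same, which is why one speaks of the projection onto $M$ along $N$.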


Now let us start with one subspace. 


4.11.4 THEOREM. Let $M$ be a linear subspace of a linear space $X$. Then there exists a projection
$P$ such that $\mathcal{R}(P) = M$.


Proof: This follows directly from Theorems 4.10.3 and 4.11.3. ∎

Let us consider some examples of projections.


EXAMPLE 1. Let $X(i\omega)$ be the Fourier transform of a signal $x(t)$, where
$X(i\omega) \in L_2(-i\infty, i\infty)$. Let $P$ be the transformation of $L_2(-i\infty, i\infty)$ into itself
defined by

$$P[X(i\omega)] = \begin{cases} X(i\omega), & \text{for } -\omega_0 \le \omega \le \omega_0,\ \omega_0 > 0 \\ 0, & \text{otherwise.} \end{cases}$$

Those familiar with linear filter theory will recognize that $P$ corresponds to an
ideal low pass filter with unit gain in the passband $(-\omega_0 \le \omega \le \omega_0)$. They will
also recall that $P$ does not correspond to a causal system. We ask the reader to
show that $P$ is a projection. ∎
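The idempotence of the ideal low pass filter can be seen on a discretized spectrum. The following is only a sketch under simplifying assumptions: the spectrum is sampled at a handful of frequencies and stored as a dictionary from frequency to complex value, and `ideal_low_pass` is a hypothetical name.

```python
def ideal_low_pass(spectrum, w0):
    """Keep X(iw) for -w0 <= w <= w0 and replace it by 0 elsewhere."""
    return {w: (X if -w0 <= w <= w0 else 0) for w, X in spectrum.items()}

# Sampled spectrum: frequency -> complex value (made-up data).
spectrum = {-3: 1 + 2j, -1: 0.5j, 0: 2, 1: -0.5j, 3: 1 - 2j}

once = ideal_low_pass(spectrum, 2)
twice = ideal_low_pass(once, 2)

is_idempotent = once == twice
```

Filtering a second time changes nothing because everything outside the passband was already removed by the first pass.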


EXAMPLE 2. Let $X$ be a linear space, and let $X_1, \ldots, X_n$ be linear subspaces
of $X$ such that $X_i$ and $\sum_{j \ne i} X_j$ are disjoint for each $i$ and with $X = X_1 + \cdots + X_n$.
Let $P_i$, $i = 1, 2, \ldots, n$, be the projection on $X$ for which $\mathcal{R}(P_i) = X_i$ and
$\mathcal{N}(P_i) = X_1 + \cdots + X_{i-1} + X_{i+1} + \cdots + X_n$. Then

$$T = \lambda_1 P_1 + \lambda_2 P_2 + \cdots + \lambda_n P_n, \qquad (4.11.1)$$

where $\lambda_1, \ldots, \lambda_n$ are scalars, defines a linear transformation of $X$ into itself. The
restriction $T_i$ of $T$ to $X_i$ is a mapping of $X_i$ into $X$. The range of $T_i$ is $X_i$. Considering
$T_i$ as a mapping of $X_i$ into itself, one has $T_i = \lambda_i I_i$, where $I_i$ is the identity operator
on $X_i$. Show that if $X$ is finite dimensional, then $T$ can be represented by a diagonal
matrix of the form

$$\begin{pmatrix} \lambda_1 & & \\ & \ddots & \\ & & \lambda_n \end{pmatrix},$$

where each $\lambda_i$ appears $\dim X_i$ times on the diagonal.
Show that $P_i P_j = 0$ for $i \ne j$. Further, show that $T$ is a projection if and only if
$\lambda_j$ is either 0 or 1 for $j = 1, \ldots, n$. (By the end of this book the reader will
recognize that the construction given in this example is extremely important.) ∎
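A small instance of the construction in Example 2 can be tried out numerically. The sketch below makes the simplest possible assumption, namely $X = R^3$ with $X_i$ the $i$-th coordinate axis, so that $P_i$ keeps the $i$-th coordinate and zeroes the others; `P` and `make_T` are hypothetical names.

```python
def P(i):
    """Projection of R^3 onto the i-th coordinate axis along the other two."""
    return lambda x: tuple(v if k == i else 0 for k, v in enumerate(x))

def make_T(lams):
    """T = lam_1 P_1 + ... + lam_n P_n, built term by term as in (4.11.1)."""
    def T(x):
        out = (0,) * len(x)
        for i, lam in enumerate(lams):
            out = tuple(o + lam * p for o, p in zip(out, P(i)(x)))
        return out
    return T

x = (2, 3, 5)

# P_i P_j = 0 for i != j.
cross_term = P(0)(P(1)(x))

# With every lambda equal to 0 or 1, T is a projection.
T = make_T((1, 0, 1))
Tx, TTx = T(x), T(T(x))

# With a lambda outside {0, 1}, T is no longer idempotent.
S = make_T((2, 0, 1))
Sx, SSx = S(x), S(S(x))
```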


EXAMPLE 3. Let $X$ be the linear space $L_2(-\infty,\infty)$, and for each time $T$
let the transformation $P_T$ of $X$ into itself be defined by

$$(P_T x)(t) = \begin{cases} x(t), & \text{for } -\infty < t \le T \\ 0, & \text{for } T < t < \infty. \end{cases}$$

Each $P_T$ is obviously a projection. ∎


EXERCISES


1. Let $X = L_2(-\infty,\infty)$ and consider the class of linear transformations of $X$ into
itself that can be represented in the form $y = Hx$ where $H \ne 0$ and

$$y(t) = \int_{-\infty}^{t} h(t - \tau)x(\tau)\,d\tau, \qquad t \in (-\infty,\infty),$$

where $h$ is in $L_1(-\infty,\infty)$ and $h(t) = 0$ for $t < 0$. (The latter requirement
guarantees that the operator is causal.) Show that no nonzero transformation in
the class is a projection. [Hint: Use Fourier transform methods.⁵ Let

$$R(i\omega) + iX(i\omega) = \int_{-\infty}^{\infty} h(t)e^{-i\omega t}\,dt.$$

Show that

$$R(i\omega) = \int_{-\infty}^{\infty} \frac{h(t) + \bar{h}(-t)}{2}\, e^{-i\omega t}\,dt,$$

$$iX(i\omega) = \int_{-\infty}^{\infty} \frac{h(t) - \bar{h}(-t)}{2}\, e^{-i\omega t}\,dt,$$

where $\bar{h}$ denotes the complex conjugate of $h$. Further, show that $R + iX$
corresponds to a projection if and only if $X(i\omega) = 0$ and $R(i\omega) = 1$ or $0$ for
all $\omega \in (-\infty,\infty)$. Then conclude that this is impossible under the assumption
that $h(t) = 0$ for $t < 0$, except for the trivial case $h(t) = 0$. See Example 8,
Section 5.19.]

5 The fact that Fourier transform methods are indeed rigorously applicable here is a fact that can
be assumed for the purposes of this problem.


2. Let $P$ be a projection on a linear space $X$. Show that the range of $P$ is given by
$\mathcal{R}(P) = \{x \in X : Px = x\}$.


3. Let $X = L_2[-\pi,\pi]$ and show that

$$(Px)(t) = \int_{-\pi}^{\pi} K(t,\tau)x(\tau)\,d\tau,$$

where

$$K(t,\tau) = \frac{1}{2\pi} \sum_{n=-10}^{10} e^{in(t-\tau)},$$

represents a projection on $X$.

4. Let $X = C[0,T]$ and define $P$ by

$$(Px)(t) = x(0)(1 - t), \qquad \text{for } 0 \le t \le T.$$

Show that $P$ is a projection.

5. Complete the argument of Example 2. 

6. Let $P(t)$ be an $n \times n$ projection matrix defined for $-\infty < t < \infty$, that is,
$[P(t)]^2 = P(t)$. Assume that the coefficients in $P$ are $C^1$ functions and consider
the matrix differential equation

$$X' = (P'P - PP')X,$$

where $P' = dP/dt$. Let $X(t)$ be a solution of the above equation. Show that
$P(t)X(t)$ is also a solution of this equation. [Hint: Show that $PP'P = 0$.]


12. LINEAR FUNCTIONALS AND THE ALGEBRAIC CONJUGATE
OF A LINEAR SPACE


There is a kind of linear transformation that is so important and used so
often that it is given a special name; namely, a linear functional.


4.12.1 DEFINITION. Let $X$ be a linear space over the scalar field $F$. A linear
transformation $l$ of $X$ into its scalar field $F$ is said to be a linear functional on $X$.


We use some special notation for linear functionals. Instead of $l(x)$, we write
$\langle x,l \rangle$; and sometimes we write $\langle \cdot\,,l \rangle$ instead of $l$.


EXAMPLE 1. Suppose that we have a tank $\Omega$ filled with a substance whose
mass per unit volume at the point $(x,y,z)$ is denoted $\rho(x,y,z)$. Assume that $\rho$ is a
point in the linear space $X$ made up of all real-valued continuous functions defined
throughout $\Omega$. The total mass contained in the tank is given by

$$M = \langle \rho,l \rangle = \int_{\Omega} \rho\,dv,$$

where $dv$ denotes a differential volume. Obviously, $l$, the operation of mapping
the density $\rho$ into total mass $M$, is a linear functional. ∎


One of the main applications of linear functionals is their use in the
characterization of subsets of linear spaces. For example, the null space of a linear
functional, that is, $\mathcal{N}(l) = \{x \in X : \langle x,l \rangle = 0\}$, is a linear subspace of $X$. If $l$ is a
nontrivial ($l \ne 0$) linear functional, then $\mathcal{N}(l)$ has some interesting properties.
In particular, $\mathcal{N}(l)$ is a very large proper linear subspace of $X$. In fact, we will now
show that $\mathcal{N}(l)$ is maximal in the following sense: If $A$ is any linear subspace of
$X$ such that $\mathcal{N}(l) \subset A$ and $\mathcal{N}(l) \ne A$, then $A = X$. Another way to say the same
thing is that the co-dimension of $\mathcal{N}(l)$ is one.


4.12.2 THEOREM. Let $l$ be a nontrivial linear functional on a linear space $X$
and let $M$ be an algebraic complement of the null space $\mathcal{N}(l)$. Then $\dim M = 1$.
Moreover, if $A$ is any linear subspace of $X$ with $\mathcal{N}(l) \subset A$ and $\mathcal{N}(l) \ne A$, then
$A = X$.


Proof: Let $x_0$ be a point in $X$ such that $l(x_0) \ne 0$. Then let $x$ be any point in
$X$ and let

$$z = x - \frac{l(x)}{l(x_0)}\,x_0.$$

It follows that $l(z) = 0$ and

$$x = z + \frac{l(x)}{l(x_0)}\,x_0.$$

So each point $x \in X$ can be expressed as the sum of a point in $\mathcal{N}(l)$ and a point in
the one-dimensional linear subspace $M$ spanned by $x_0$. Therefore, $X = \mathcal{N}(l) + M$
and $\mathcal{N}(l) \cap M = \{0\}$. Hence, $M$ is an algebraic complement of $\mathcal{N}(l)$ and
$\operatorname{codim} \mathcal{N}(l) = 1$. Since $\mathcal{N}(l) \ne A$, we can also choose $x_0$ so that it is in $A - \mathcal{N}(l)$.
The rest of the theorem follows immediately. ∎
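The decomposition used in this proof can be tried out on $R^3$. The sketch below is an illustration only: the functional `l`, the point `x0`, and the helper `split` are hypothetical choices, and exact rational arithmetic is used so that $l(z) = 0$ holds exactly.

```python
from fractions import Fraction

def l(x):
    """A sample linear functional on R^3 (hypothetical coefficients)."""
    return 2 * x[0] - x[1] + 3 * x[2]

x0 = (1, 0, 0)          # any point with l(x0) != 0 will do; here l(x0) = 2

def split(x):
    """Write x = z + c*x0 with l(z) = 0, where c = l(x) / l(x0)."""
    c = Fraction(l(x), l(x0))
    m_part = tuple(c * v for v in x0)              # the part in M = span{x0}
    z = tuple(a - b for a, b in zip(x, m_part))    # the part in the null space
    return z, m_part

x = (1, 2, 3)
z, m_part = split(x)
recombined = tuple(a + b for a, b in zip(z, m_part))
```

Here $l(x) = 9$, so the component along $x_0$ is $\tfrac{9}{2}x_0$ and the remainder $z = (-\tfrac{7}{2}, 2, 3)$ lies in the null space.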


Since the intersection of linear subspaces is a linear subspace, we see that if
$\{l_\alpha\}$ is a set of linear functionals on $X$, then $S = \bigcap_\alpha \mathcal{N}(l_\alpha)$ is a linear subspace of $X$.
Thus we can use linear functionals and sets of linear functionals to characterize
linear subspaces.

We can also use linear functionals to introduce a generalized plane concept.
Recall that in $R^3$ a plane is characterized as being the set of all points $x = (x_1,x_2,x_3)$
such that

$$\alpha_1 x_1 + \alpha_2 x_2 + \alpha_3 x_3 = \alpha, \qquad (4.12.1)$$

where $\alpha_1$, $\alpha_2$, $\alpha_3$, and $\alpha$ are real numbers and $(\alpha_1,\alpha_2,\alpha_3) \ne (0,0,0)$. Now the
left side of (4.12.1) defines a linear functional, call it $l$, on $R^3$. So this plane can also
be viewed as the set of all points in $R^3$ such that $\langle x,l \rangle = \alpha$. Of course, different
linear functionals $l$ and different constants $\alpha$ can yield different planes in $R^3$. In
general, we make the following definition.


4.12.3 DEFINITION. Let $X$ be a linear space over a scalar field $F$. Given a
linear functional $l$ on $X$ and a scalar $\alpha$, the set

$$H = \{x \in X : \langle x,l \rangle = \alpha\}$$

is said to be the hyperplane in $X$ determined by $l$ and $\alpha$. When $F = R$, the sets
$\{x \in X : \langle x,l \rangle \ge \alpha\}$, $\{x \in X : \langle x,l \rangle > \alpha\}$, $\{x \in X : \langle x,l \rangle \le \alpha\}$, and $\{x \in X : \langle x,l \rangle < \alpha\}$
are referred to as half-spaces determined by the hyperplane $H$. A set $A \subset X$ is
said to be on one side of the hyperplane $H$ if $A$ is contained in one of the half-spaces.
The set $A$ lies strictly on one side of $H$ if in addition $A$ does not intersect $H$.


EXAMPLE 2. Consider the real space $X = L_2[0,T]$, and let $l_1$, $l_2$ be the linear
functionals on $X$ defined by

$$\langle x,l_1 \rangle = \int_0^T y_1(t)x(t)\,dt,$$

$$\langle x,l_2 \rangle = \int_0^T y_2(t)x(t)\,dt,$$

where $y_1$ and $y_2$ are given points in $L_2[0,T]$. The sets $\{x \in X : \langle x,l_1 \rangle = 1.2\}$,
$\{x \in X : \langle x,l_1 \rangle = 0.8\}$, $\{x \in X : \langle x,l_2 \rangle = 1.1\}$, and $\{x \in X : \langle x,l_2 \rangle = 0.9\}$ are
hyperplanes in $X$. The set

$$A = \{x \in X : 0.8 \le \langle x,l_1 \rangle \le 1.2\} \cap \{x \in X : 0.9 \le \langle x,l_2 \rangle \le 1.1\}$$

is the intersection of four half-spaces or, if you will, the intersection of two
"slabs." In application, it might be that $x$ was some input valve position as a
function of time and the numbers $\langle x,l_1 \rangle$ and $\langle x,l_2 \rangle$ were final temperatures
reached at time $t = T$ at two points in the system. The set $A$ would be the set of all
inputs $x$ that yield final temperatures sufficiently close to certain "ideal"
values. ∎
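Membership in a set such as $A$ is easy to test once the functionals can be evaluated. The sketch below is a rough illustration under strong assumptions: functions on $[0,T]$ are replaced by two samples with spacing `dt`, the integrals are replaced by Riemann sums, and the weights passed to the hypothetical helper `make_functional` are made-up data.

```python
def make_functional(y, dt):
    """Riemann-sum stand-in for <x, l> = integral of y(t) x(t) dt over [0, T]."""
    return lambda x: sum(a * b for a, b in zip(y, x)) * dt

dt = 0.5                           # two samples, so T = 1 in this toy setup
l1 = make_functional([1, 1], dt)   # stands in for y1
l2 = make_functional([2, 0], dt)   # stands in for y2

def in_A(x):
    """True when x lies in both slabs of Example 2."""
    return 0.8 <= l1(x) <= 1.2 and 0.9 <= l2(x) <= 1.1

inside = in_A([1, 1])     # both functionals evaluate to 1.0
outside = in_A([2, 2])    # the first functional evaluates to 2.0
```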


So we can use linear functionals to characterize certain kinds of sets in a linear
space. The question now arises, how rich is our supply of linear functionals? For
example, a linear functional $l$ on $X$ characterizes a linear subspace of $X$ with
co-dimension one, namely, $\mathcal{N}(l)$. But what about the other way around? Suppose
that $M$ is any linear subspace of $X$ with co-dimension one. Does there exist a
linear functional $l$ on $X$ such that $\mathcal{N}(l) = M$? Fortunately, the answer is yes. The
same question also has a geometric facet.

A hyperplane can be viewed as a translated linear subspace of co-dimension
one. Indeed, consider the hyperplane $H = \{x \in X : \langle x,l \rangle = \alpha\}$ and let $x_0$ be an
arbitrary point in $H$. It is easily shown that the set $M = \{x \in X : x + x_0 \in H\}$ is a
linear subspace of co-dimension one. Indeed $M$ is the null space of the linear
functional $l$, that is, $\mathcal{N}(l) = M$. The situation is illustrated in Figure 4.12.1. Again one
can wonder about the supply of linear functionals on $X$. Suppose we are given a
linear subspace $M$ of co-dimension one and a point $x_0$. Let $H = \{x \in X : x - x_0 \in M\}$.
Does there exist a linear functional $l$ and a scalar $\alpha$ such that $\langle x,l \rangle = \alpha$ if and
only if $x \in H$? The answer here is also yes.

Figure 4.12.1.

The next two theorems contain the answers to the preceding questions.


4.12.4 THEOREM. Let $M$ be a linear subspace of a linear space $X$, and let $l_0$
be a linear functional defined on $M$. Then there exists a linear functional $l$ defined
on $X$ such that $l$ is an extension of $l_0$, that is, $l$ is defined on $X$ and $\langle x,l \rangle = \langle x,l_0 \rangle$
for all $x \in M$.


4.12.5 THEOREM. Let $M_0$ be a proper subspace of a linear space $X$, and let
$x_0$ be a point in $X - M_0$. Then there exists a linear functional $l$ on $X$ such that
$\langle x,l \rangle = 0$ if $x \in M_0$ and $\langle x_0,l \rangle = 1$.


Proof of Theorem 4.12.4 is contained in the solution of Exercise 5, Section 3.
So assuming Theorem 4.12.4, let us go on to the proof of Theorem 4.12.5. Let
$[x_0]$ denote the one-dimensional linear subspace of $X$ spanned by $x_0$. Let
$M = M_0 + [x_0]$. Since $M_0 \cap [x_0] = \{0\}$, a point $x \in M$ can be expressed uniquely
in the form $x = x' + \alpha x_0$, where $x' \in M_0$ and $\alpha$ is a scalar. Let $l_M$ be the linear
functional defined on $M$ such that $\langle x' + \alpha x_0, l_M \rangle = \alpha$. Clearly $\langle x,l_M \rangle = 0$ for all
$x \in M_0$ and $\langle x_0,l_M \rangle = 1$. Then using Theorem 4.12.4 extend $l_M$ to $X$. This
extension, call it $l$, is a linear functional on $X$ of the desired form. ∎


Let $X^f$ denote the set of all linear functionals defined on a linear space $X$.
We have just seen that in a meaningful sense there are many linear functionals in
$X^f$. It also happens, with surprisingly widespread ramifications, that $X^f$ can be
viewed as a linear space over the same scalar field $F$. If $l_1$ and $l_2$ are in $X^f$, then
$l_1 + l_2$ is defined by

$$\langle x, l_1 + l_2 \rangle = \langle x,l_1 \rangle + \langle x,l_2 \rangle, \qquad \text{for all } x \in X.$$

Similarly, we define scalar multiplication by

$$\langle x, \alpha l \rangle = \alpha \langle x,l \rangle, \qquad \text{for all } x \text{ in } X.$$

The origin in $X^f$ is the zero functional. Moreover, one has

$$\langle x, -l \rangle = -\langle x,l \rangle, \qquad \text{for all } x \text{ in } X.$$

We refer to the linear space $X^f$ as the algebraic conjugate of $X$.


EXERCISES 
1. Show that if $\langle x,x' \rangle = 0$ for all $x' \in X^f$, then $x = 0$.


2. Let $B$ be a Hamel basis for a finite-dimensional space $X$. If $B = \{x_1, \ldots, x_n\}$,
then we know that each $x \in X$ can uniquely be written in the form

$$x = \alpha_1 x_1 + \cdots + \alpha_n x_n,$$

where the $\alpha$'s are scalars. Show that if $z'$ is any point in $X^f$, there exist fixed
scalars $\beta_1, \ldots, \beta_n$ such that

$$\langle x,z' \rangle = \alpha_1\beta_1 + \cdots + \alpha_n\beta_n, \qquad \text{for all } x \in X.$$


3. Show that if $X$ is finite dimensional, then $\dim(X^f) = \dim(X)$. (In the
infinite-dimensional case this is not true.)


4. Let $B_X$ be a Hamel basis for a linear space $X$. If $x_\alpha$ is an element of $B_X$, let
$x_\alpha' \in X^f$ be such that $\langle x_\alpha, x_\alpha' \rangle = 1$ and $\langle x_\beta, x_\alpha' \rangle = 0$, where $x_\beta$ is any other
element of $B_X$. Show that the mapping $T$ of $X$ into itself defined by

$$T(x) = \langle x, x_{\alpha_1}' \rangle x_{\alpha_1} + \cdots + \langle x, x_{\alpha_n}' \rangle x_{\alpha_n}$$

is a projection onto the subspace spanned by the set $\{x_{\alpha_1}, x_{\alpha_2}, \ldots, x_{\alpha_n}\}$. What
is the null space of $T$?


13. TRANSPOSE OF A LINEAR TRANSFORMATION 


Suppose that $L: X \to Y$ is a linear transformation and let $X^f$ and $Y^f$ denote
the algebraic conjugates of $X$ and $Y$, respectively. Let $y'$ be any point in $Y^f$ and
consider a hyperplane

$$H_Y = \{y \in Y : \langle y,y' \rangle = \alpha\}.$$

One is often interested, either directly or indirectly, in the inverse image of $H_Y$
under the linear mapping $L$, that is,

$$L^{-1}(H_Y) = \{x \in X : \langle Lx,y' \rangle = \alpha\}.$$


The situation is sketched in Figure 4.13.1. The first thing to note is that the
expression $\langle L(\cdot),y' \rangle$ represents the composition $y'L$ of the linear transformation
$L$ and $y'$. But then $y'L$ is a linear mapping of $X$ into the scalar field $F$; therefore,
the expression $\langle L(\cdot),y' \rangle$ represents a linear functional on $X$. Denoting this
functional by $x'$, we have

$$\langle x,x' \rangle = \langle Lx,y' \rangle, \qquad \text{for all } x \in X.$$

Figure 4.13.1.

As long as $x' \ne 0$, $L^{-1}(H_Y)$ is the hyperplane $H_X = \{x \in X : \langle x,x' \rangle = \alpha\}$. If
$x' = 0$ and $\alpha = 0$, then $L^{-1}(H_Y) = X$. If $x' = 0$ and $\alpha \ne 0$, then $L^{-1}(H_Y)$ is empty.
(Why?)


EXAMPLE 1. Let $X = R^2$, $Y = R^3$, and $L$ be the linear transformation
represented by the following matrix equation


E-E Je 


Further, let

$$H_Y = \{y \in Y : y_1 + y_2 + y_3 = 10\}.$$

We claim that the inverse image of $H_Y$ under $L$ is the hyperplane

$$H_X = \{x \in X : 8x_1 + 6x_2 = 10\}.$$
Of course what we need to do next is find an orderly way to determine the
$x'$ that is associated with a $y'$. First note that implicit in the foregoing discussion
is a mapping of $Y^f$ into $X^f$. Indeed, given any $y' \in Y^f$, its image under this new
mapping is $x' \in X^f$, where $x'$ is the composition $y'L$. Let us denote this mapping
by $L^T$, that is, $x' = L^T y'$. In other words, $L^T$ is defined so that

$$\langle x, L^T y' \rangle = \langle Lx,y' \rangle, \qquad (4.13.1)$$

for all $x \in X$ and all $y' \in Y^f$. It is easy to show that $L^T$ is a linear mapping. Indeed
if $x_i' = L^T y_i'$ for $i = 1, 2$, then

$$\langle x, x_1' + x_2' \rangle = \langle x,x_1' \rangle + \langle x,x_2' \rangle = \langle Lx,y_1' \rangle + \langle Lx,y_2' \rangle = \langle Lx, y_1' + y_2' \rangle.$$

Hence $(x_1' + x_2') = L^T(y_1' + y_2')$. Similarly one has $\alpha x' = L^T(\alpha y')$. Figure 4.13.2
illustrates (4.13.1). Let us now formalize what has just been said in a definition.


Figure 4.13.2.


4.13.1 DEFINITION. Let $L: X \to Y$ be a linear transformation. A linear
transformation $L^T: Y^f \to X^f$ such that

$$\langle x, L^T y' \rangle = \langle Lx,y' \rangle,$$

for all $x \in X$ and all $y' \in Y^f$ is said to be the (algebraic) transpose of $L$.


Of course, the development leading up to the definition above shows that every
$L$ has a unique transpose, $L^T$.
The next example shows why we refer to $L^T$ as the transpose of $L$.


EXAMPLE 2. (This example is a continuation of Example 1.) Since $X = R^2$
and $Y = R^3$ and since the dimension of a linear space and its algebraic conjugate
are the same in the finite-dimensional case (Exercise 3, Section 12), $X^f$ and $Y^f$ are,
respectively, two- and three-dimensional real linear spaces. The transpose $L^T$,
then, is a linear transformation of a three- into a two-dimensional linear space.
In the spirit of Section 8, we can represent $L^T$ using a real matrix with two rows
and three columns. Of course, the actual $2 \times 3$ matrix depends on the bases
chosen for $X^f$ and $Y^f$. We choose two very special ones. Let the basis of $X^f$ be
$\{\eta_1', \eta_2'\}$, where the linear functional $\eta_i'$ on $X$ is defined by $\langle x,\eta_i' \rangle = x_i$, $i = 1, 2$.
We leave it to the reader to show that $\{\eta_1', \eta_2'\}$ is linearly independent and
spans $X^f$. Similarly, let the basis of $Y^f$ be $\{\nu_1', \nu_2', \nu_3'\}$ where $\langle y,\nu_i' \rangle = y_i$,
$i = 1, 2, 3$. Then for any $x' \in X^f$ and $y' \in Y^f$, we have $x' = \alpha_1\eta_1' + \alpha_2\eta_2'$ and
$y' = \beta_1\nu_1' + \beta_2\nu_2' + \beta_3\nu_3'$. It then follows (after a little thought) that the matrix
representation of $L^T$ is
gee ai 2 LF. 
at (2 4 O17} 
: B3 


Carefully note that the above matrix is the transpose of the matrix in Example 1.
Needless to say, other bases could be chosen for $X^f$ and $Y^f$ such that the matrix
representing $L^T$ would not be the transpose of the matrix representing L. ∎
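The defining identity of the transpose can be checked numerically in the coordinate setting of Example 2. The following is a minimal sketch, assuming a hypothetical 3 x 2 matrix for L (it is not the matrix of Example 1) and the coordinate bases described above; the helper names `apply`, `transpose`, and `pairing` are introduced here for illustration only.

```python
# Hypothetical matrix of L: R^2 -> R^3 (an assumption, not the matrix of Example 1).
L = [[1.0, 2.0],
     [0.0, 4.0],
     [3.0, -1.0]]


def apply(matrix, vector):
    """Matrix-vector product using plain lists."""
    return [sum(a * v for a, v in zip(row, vector)) for row in matrix]


def transpose(matrix):
    """Interchange rows and columns."""
    return [list(column) for column in zip(*matrix)]


def pairing(point, functional):
    """<x, x'> when x' is given by its coordinates in the coordinate dual basis."""
    return sum(p * f for p, f in zip(point, functional))


x = [2.0, -1.0]              # a point of X = R^2
y_prime = [1.0, 0.5, -2.0]   # coordinates (beta_1, beta_2, beta_3) of y'

lhs = pairing(x, apply(transpose(L), y_prime))   # <x, L^T y'>
rhs = pairing(apply(L, x), y_prime)              # <Lx, y'>
print(lhs, rhs)
```

Both pairings agree, which is the identity of Definition 4.13.1 written out for one matrix and one pair of vectors.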


The concepts of the transpose and the inverse of a linear operator can be put 
together to get further information about each. We note the following facts. The 
proofs are outlined in the exercises. 


4.13.2 THEOREM. Let L be a linear transformation of X into Y. Then $\mathcal{R}(L) = Y$
if and only if $L^T$ is one-to-one.


4.13.3 THEOREM. Let L be a linear transformation of X into Y. Then L is
one-to-one if $L^T$ maps $Y^f$ onto $X^f$.


Carefully note the lack of symmetry between these two theorems. In Theorem
4.13.2 the statement is "if and only if" whereas in Theorem 4.13.3 it is merely
"if." There are linear spaces X for which the latter theorem becomes an "if and
only if" statement; among these linear spaces are all those of finite dimension.


EXERCISES 


The first four exercises are concerned with the proof of Theorems 4.13.2 and 
4.13.3. 


1. Show that if L is a linear transformation of X into Y, then

$$\mathcal{N}(L) = \{x \in X: \langle x, x' \rangle = 0 \text{ for all } x' \in \mathcal{R}(L^T)\}.$$

[Hint: Denoting the set on the right by A, first show that $\mathcal{N}(L) \subset A$ as follows:
Let x be any point in $\mathcal{N}(L)$ and show that $x \in A$ by using

$$\langle Lx, y' \rangle = \langle x, L^T y' \rangle,$$

for all $x \in X$ and $y' \in Y^f$. Next show that $\mathcal{N}(L) \supset A$ as follows: Let x be any
point in A and show that $\langle Lx, y' \rangle = 0$, for all $y' \in Y^f$. Then use the result of
Exercise 1, Section 12 to show that $x \in \mathcal{N}(L)$.]




2. Prove Theorem 4.13.3. [Hint: It follows from the foregoing exercise that L is
one-to-one if and only if A = {0}. Therefore, show that⁶ $\mathcal{R}(L^T) = X^f$ implies
that A = {0}. Use Exercise 1, Section 12 again.]


3. Let L be a linear transformation of X into Y. Show that

$$\mathcal{R}(L) = \{y \in Y: \langle y, y' \rangle = 0 \text{ for all } y' \in \mathcal{N}(L^T)\}.$$

[Hint: Denote the set on the right by B. First show that $\mathcal{R}(L) \subset B$ as follows:
Let y be any point in $\mathcal{R}(L)$. Letting x denote a pre-image of y, use

$$\langle Lx, y' \rangle = \langle x, L^T y' \rangle$$

to conclude that $\langle y, y' \rangle = 0$, for all $y' \in \mathcal{N}(L^T)$. Next show that $\mathcal{R}(L) \supset B$ by
arguing by contradiction. That is, assume that there exists a point $y_0 \in B$ which
is not in $\mathcal{R}(L)$. Use Theorem 4.12.5 to show there exists a functional $y_0' \in Y^f$
such that $\langle y, y_0' \rangle = 0$, for all $y \in \mathcal{R}(L)$ and $\langle y_0, y_0' \rangle = 1$. Then show that
$y_0' \in \mathcal{N}(L^T)$. Since $y_0 \in B$, it then follows that $\langle y_0, y_0' \rangle = 0$, which is a
contradiction.]


4. Prove Theorem 4.13.2. [Hint: Use the result of Exercise 3. Show that B = Y if
and only if $\mathcal{N}(L^T) = \{0\}$. First assume $\mathcal{N}(L^T) = \{0\}$. It follows almost trivially
that B = Y. Next assume that B = Y. Let $y'$ be any point in $\mathcal{N}(L^T)$. Since, by
assumption, B = Y, $\langle y, y' \rangle = 0$, for all $y \in Y$.]


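5. Generalize Example 2 to linear mappings of the real linear spaces $R^n$ into $R^m$.
That is, assume that $L: R^n \to R^m$ is given by the matrix equation

$$\begin{bmatrix} y_1 \\ \vdots \\ y_m \end{bmatrix} =
\begin{bmatrix} a_{11} & \cdots & a_{1n} \\ \vdots & & \vdots \\ a_{m1} & \cdots & a_{mn} \end{bmatrix}
\begin{bmatrix} x_1 \\ \vdots \\ x_n \end{bmatrix}.$$

Show that with respect to appropriate bases on $R^n$ and $R^m$, $L^T$ can be represented
by the matrix operator

$$\begin{bmatrix} a_{11} & \cdots & a_{m1} \\ \vdots & & \vdots \\ a_{1n} & \cdots & a_{mn} \end{bmatrix}.$$


6. What happens in Exercise 5 for linear mappings of the complex linear spaces
$C^n$ and $C^m$?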


SUGGESTED REFERENCES 


Halmos [4]. 
Indritz [1]. 
Nering [1]. 


⁶ The reason that $\mathcal{R}(L^T) = X^f$ is not a necessary condition for L to be one-to-one is that in certain
infinite-dimensional linear spaces X there exist proper "total" subspaces M of $X^f$. That is, M
has the property that if $\langle x, x' \rangle = 0$ for all $x' \in M$ then x = 0. One could have $\mathcal{R}(L^T) = M$ when L
is one-to-one. Roughly speaking, M contains a "rich supply" of linear functionals without being
all of $X^f$.


Combined Topological and Algebraic Structure


1. Introduction

Part A Banach Spaces

2. Definitions
3. Examples of Normed Linear Spaces
4. Sequences and Series
5. Linear Subspaces
6. Continuous Linear Transformations
7. Inverses and Continuous Inverses
8. Operator Topologies
9. Equivalence of Normed Linear Spaces
10. Finite-Dimensional Spaces
11. Normed Conjugate Space and Conjugate Operator

Part B Hilbert Spaces

12. Inner Product Spaces and Hilbert Spaces
13. Examples
14. Orthogonality
15. Orthogonal Complements and the Projection Theorem
16. Orthogonal Projections
17. Orthonormal Sets and Bases: Generalized Fourier Series
18. Examples of Orthonormal Bases
19. Unitary Operators and Equivalent Inner Product Spaces
20. Sums and Direct Sums of Hilbert Spaces
21. Continuous Linear Functionals

Part C Special Operators

22. The Adjoint Operator
23. Normal and Self-Adjoint Operators
24. Compact Operators
25. Foundations of Quantum Mechanics

1. INTRODUCTION 


The concept of a continuous mapping was introduced in Chapter 3 
(Section 5). This is a topological concept. A linear mapping was defined in 
Chapter 4 (Section 3). This is an algebraic concept. Our purpose in this chapter is
to introduce a new class of spaces which, among other things, allows one to combine
the two concepts of continuity and linearity. That is, we shall introduce the
important concept of a continuous linear transformation.

The underlying space which allows one to combine the concepts of continuity 
and linearity is the normed linear space. This space is formed by suitably combining, 
by means of a norm, the topological structure of metric spaces and the algebraic 
structure of linear spaces. This combination is performed in such a manner that the 
two structures, topological and algebraic, are compatible. That is, addition and 
scalar multiplication are continuous, and the implied metric has a certain algebraic 
structure. 

Part A of this chapter is devoted to a discussion of a number of elementary 
facts about normed linear spaces and Banach spaces. We ask the reader to note 
carefully the geometric nature of the structure of these spaces. 

Part B of this chapter treats inner product spaces and Hilbert spaces. These
are normed linear spaces with some very important additional structure. In particular,
an inner product and the concept of orthogonality are present. Because of this
additional structure, the geometry of Hilbert spaces is relatively simple to understand.
Indeed, it is more or less a generalization of Euclidean geometry to
infinite-dimensional spaces.

It is difficult to overemphasize the importance of Hilbert spaces. A truly 
amazing number of problems in engineering and science can be fruitfully treated 
with geometric methods in Hilbert spaces. We shall illustrate some of them in the 
examples of this and the following chapters. 




Part A Banach Spaces


2. DEFINITIONS


Let $x = (x_1, x_2)$ denote a point or vector in the real Euclidean plane. The
Euclidean length of this vector is given by

$$\|x\| = (x_1^2 + x_2^2)^{1/2}.$$

The Euclidean distance between two points $x = (x_1, x_2)$ and $y = (y_1, y_2)$ is given
by $\|x - y\|$, that is,

$$\|x - y\| = ((x_1 - y_1)^2 + (x_2 - y_2)^2)^{1/2}. \tag{5.2.1}$$

In this way one can view $\|x\|$ as a real-valued function defined on the real Euclidean
plane. Moreover, this function generates a distance function or metric by means of
(5.2.1). This function on the plane is an example, in fact, the archetypal example, of
a norm on a linear space.

It is our desire to extend the foregoing concepts of length and distance to
linear spaces (other than the plane), which leads us to seek a conception of "norm"
which incorporates the essential features of length and distance. Experience has
shown that the following definition is what we seek:


5.2.1 DEFINITION. A real-valued function $\|x\|$ defined on a linear space X,
where $x \in X$, is said to be a norm on X if

(N1) $\|x\| \ge 0$ (Positivity),
(N2) $\|x + y\| \le \|x\| + \|y\|$ (Triangle inequality),
(N3) $\|\alpha x\| = |\alpha|\,\|x\|$, $\alpha$ an arbitrary scalar (Homogeneity),
(N4) $\|x\| = 0$ if and only if x = 0 (Positive definiteness),

where x and y are arbitrary points in X. The number $\|x\|$ is referred to as the norm
of x, or length of x.


Axiom (N1) says that the length of a vector is nonnegative, and Axiom (N4)
says that only the origin (or zero vector) has length 0. Axiom (N2) is a type of
triangle inequality, and, as we shall see, it is related to the triangle inequality for a
metric. The Homogeneity Axiom (N3) says that scalar multiplication results in a
stretching (or shrinking) of the length of x by a factor $|\alpha|$.
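The axioms can be probed numerically. The following is a minimal sketch, assuming vectors in $R^3$ stored as Python lists; the function names `euclidean`, `half_power`, and `violates_axioms` are hypothetical names introduced for this illustration. Random trials of this kind can refute an axiom but cannot prove it.

```python
import math
import random


def euclidean(x):
    """The Euclidean length, the archetypal norm."""
    return math.sqrt(sum(c * c for c in x))


def half_power(x):
    """(sum |x_i|^(1/2))^2: positive and homogeneous, but (N2) fails."""
    return sum(abs(c) ** 0.5 for c in x) ** 2


def violates_axioms(f, trials=2000, dim=3, tol=1e-9, seed=0):
    """Return the label of the first axiom seen to fail on random data, else None."""
    rng = random.Random(seed)
    if f([0.0] * dim) != 0.0:          # the "if" half of (N4)
        return "N4"
    for _ in range(trials):
        x = [rng.uniform(-5, 5) for _ in range(dim)]
        y = [rng.uniform(-5, 5) for _ in range(dim)]
        a = rng.uniform(-5, 5)
        if f(x) < 0:                                              # (N1)
            return "N1"
        if f([s + t for s, t in zip(x, y)]) > f(x) + f(y) + tol:  # (N2)
            return "N2"
        if abs(f([a * s for s in x]) - abs(a) * f(x)) > tol * (1 + f(x)):  # (N3)
            return "N3"
    return None


print(violates_axioms(euclidean))
print(violates_axioms(half_power))
```

For instance, with x = (1, 0, 0) and y = (0, 1, 0) the function `half_power` gives 4 for x + y but only 1 + 1 for the two summands, so the triangle inequality fails.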




5.2.2 DEFINITION. A normed linear space is a pair $(X, \|\cdot\|)$, where X is a
linear space and $\|\cdot\|$ is a norm defined on X. When no confusion is likely we will
denote $(X, \|\cdot\|)$ by X.


We have asserted that a normed linear space combines the topological structure 
of a metric space with the algebraic structure of a linear space. Although the asserted 
algebraic structure is evident from the definition, the topological structure requires 
a few remarks. 

We claim that the function

$$d(x,y) = \|x - y\|,$$

where x and y are two points in the normed linear space $(X, \|\cdot\|)$, is a metric defined
on X. Let x, y, and z be arbitrary points in X. Axiom (N1) asserts that $\|x - y\| \ge 0$
and Axiom (N4) implies that $\|x - y\| = 0$ when x = y. Hence $\|x - y\|$ satisfies
Axiom (M1) in the definition of a metric in Section 3.2. The "only if" part of
Axiom (N4) shows that if $\|x - y\| = 0$, then x = y, which is Axiom (M2) for a
metric. Axiom (M3) for a metric follows from (N3) by setting $\alpha = -1$. Finally
from (N2) one gets

$$\|x - y\| = \|x - z + z - y\| \le \|x - z\| + \|z - y\|,$$

which is the triangle inequality (M4) for a metric. Thus, $\|x - y\|$ is indeed a metric
on X. Since $\|x\| = \|x - 0\|$ we see that the norm of a vector is equal to its distance
from the origin.

The norm, then, generates a metric on X. We shall show in the exercises that 
there are metrics that are not generated by norms in the sense defined above. The 
concept of a norm does restrict the class of metrics. 
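One way to see that the norm concept restricts the class of metrics is to test homogeneity (N3) on a metric that ignores scale. The sketch below uses the discrete metric on $R^2$ as an assumed example; the names `discrete_metric`, `length_x`, and `length_scaled` are hypothetical.

```python
def discrete_metric(x, y):
    """d(x, y) = 0 if x == y and 1 otherwise; this is a metric on any set."""
    return 0.0 if x == y else 1.0


zero = (0.0, 0.0)
x = (1.0, 2.0)
alpha = 3.0
scaled = tuple(alpha * c for c in x)

length_x = discrete_metric(x, zero)             # candidate "norm" of x
length_scaled = discrete_metric(scaled, zero)   # candidate "norm" of alpha * x

# (N3) would require length_scaled == |alpha| * length_x.
homogeneous = (length_scaled == abs(alpha) * length_x)
print(length_x, length_scaled, homogeneous)
```

Since d(3x, 0) = d(x, 0) = 1, the function d(x, 0) cannot satisfy (N3), so this metric is not of the form $\|x - y\|$ for any norm.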

Since a normed linear space is a metric space, one can ask questions about 
continuity and convergence utilizing the mathematical apparatus developed in 
Chapter 3. A convention which is universally accepted is the following: 


Whenever one discusses the topological (or metrical) properties of a normed linear
space $(X, \|\cdot\|)$, the metric is defined in terms of the given norm by $d(x,y) = \|x - y\|$.


It is important to emphasize that, just as there are many metrics which can be
defined on a given set, there are also many norms which can be defined on a given
linear space X. Each norm gives a new normed linear space, and each norm defines
a different metric on X. Different norms may define equivalent metrics, in which
case it is reasonable to say that the norms are equivalent. This concept is discussed
again in Section 9.

Once and for all we answer a few elementary questions of continuity. The norm,
considered as a mapping of the normed linear space X into the reals R, is continuous.
The addition operation in X (considered as a mapping of X x X into X) and the
scalar multiplication operation (considered as a mapping of F x X into X, where
F is the scalar field) are continuous. We leave the proofs of these statements as
exercises.




One metric space concept is worthy of special note at this point, and that is
completeness. It turns out that normed linear spaces with this property play a
crucial role, and that prompts the following definition.


5.2.3 DEFINITION. A normed linear space is said to be a Banach space if it is
complete.


EXERCISES 


1. Show that the norm $\|\cdot\|$ considered as a mapping of a normed linear space X
into the reals is continuous. [Hint: Use the triangle inequality.]


2. Show that addition and scalar multiplication are continuous.


3. A function $\|x\|$ on a linear space X that satisfies conditions (N1), (N2), and
(N3) is said to be a pseudonorm. Let $\|x\|$ be a pseudonorm on X. Show that
$\rho(x,y) = \|x - y\|$ is a pseudometric on X.


4. Characterize all possible norms on the real line R, where R is considered a real
linear space. On the complex plane C, where C is considered as a complex
linear space. Show that C may have other norms when it is considered as a real
linear space.


5. Let $(X, \|\cdot\|)$ be a normed linear space and let $S_r = \{x: \|x\| = r\}$ where r > 0.
Assume that $X \ne \{0\}$. Show that $(X, \|\cdot\|)$ is a Banach space if and only if the
metric space $(S_r, \|\cdot\|)$ is complete for some r > 0.


6. Let p satisfy 0 < p < 1 and consider the space $L_p[0,1]$ of all functions with

$$\|x\| = \int_0^1 |x(t)|^p \, dt < \infty.$$

Show that $\|x\|$ is not a norm on $L_p[0,1]$. Show that $d(x,y) = \|x - y\|$ is a
metric on $L_p[0,1]$. [Hint: Note that if $0 < \alpha < 1$, then $\alpha < \alpha^p < 1$.]


7. Define

$$\sigma_n(f) = \sup\{|f(t)|: |t| \le n\},$$
$$\rho_n(f) = \min(1, \sigma_n(f)),$$
$$\|f\| = \sum_{n=1}^{\infty} 2^{-n} \rho_n(f),$$

where $f \in C(-\infty,\infty)$.

(a) Show that $\sigma_n(f)$ is a pseudonorm on $C(-\infty,\infty)$.
(b) Show that $\rho_n(f)$ and $\|f\|$ are not norms.
(c) Show that $d(f,g) = \|f - g\|$ is a metric on $C(-\infty,\infty)$.
(d) Show that $d(f_n, f) \to 0$ as $n \to \infty$ if and only if $f_n(t) \to f(t)$ uniformly on
compact sets in $-\infty < t < \infty$.


8. (Generalization of Exercises 6 and 7.) Let X be a linear space and let $\|x\|$ be a
real-valued function defined on X. Show that $d(x,y) = \|x - y\|$ is a metric if
and only if $\|x\|$ satisfies (N1), (N2), (N4), and $\|x\| = \|-x\|$ for all x in X.




9. Let $(X, \|\cdot\|)$ be a normed linear space and let $\{x_n\}$ be a sequence in X with
$x = \lim_{n \to \infty} x_n$. Assume that $\|x_n - y\| \le \alpha$ for all n. Show that $\|x - y\| \le \alpha$.


10. Show that a Hamel basis for a Banach space is either finite or uncountably 
infinite. (Completeness is important for this. Use Exercise 17, Section 3.13.) 


11. Let X denote the collection of all sequences $x = (x_1, x_2, \ldots)$ of complex
numbers and define

$$\|x\| = \sum_{n=1}^{\infty} 2^{-n} \min(1, |x_n|).$$

(a) Is $\|\cdot\|$ a norm on X?
(b) Does $\|x - y\|$ define a metric on X? If so, explain the meaning of
$\|x^n - x\| \to 0$ as $n \to \infty$.
12. Employ a normed linear space in the construction of a mathematical model. 
13. Give an example of a metric d on a linear space such that d(x,0) is not a norm. 
14. A real-valued function f defined on a linear space X is said to be convex if
$f(\alpha x + \beta y) \le \alpha f(x) + \beta f(y)$ for all $x, y \in X$ and all real numbers $\alpha$ and $\beta$
such that $0 \le \alpha, \beta \le 1$, and $\alpha + \beta = 1$. Show that a norm is a convex function.


3. EXAMPLES OF NORMED LINEAR SPACES 


Many of the examples of metric spaces presented in Chapter 3 are also normed 
linear spaces, that is, the metric is generated by a norm. Some of these will be dis- 
cussed in the exercises. In many of the examples below, we shall leave the proof that 
a certain function is a norm as an exercise. 


EXAMPLE 1. Let $x = (x_1, \ldots, x_n)$ be a point in $R^n$. We define a norm on $R^n$ by

$$\|x\|_p = \Big[\sum_{i=1}^{n} |x_i|^p\Big]^{1/p} \tag{5.3.1a}$$

for $1 \le p < \infty$, and

$$\|x\|_\infty = \max\{|x_1|, \ldots, |x_n|\} \tag{5.3.1b}$$

for $p = \infty$. It is easily seen that for each $1 \le p \le \infty$, $\|x\|_p$ is a norm on $R^n$. The only
difficult step is the proof of the triangle inequality, but this is a direct consequence
of the Minkowski Inequality. (See Appendix A.)

Equation (5.3.1) also defines a norm on the complex linear space $C^n$.

It follows (see Exercise 8, Section 3.13) that $(R^n, \|\cdot\|_p)$ and $(C^n, \|\cdot\|_p)$ are Banach
spaces, for $1 \le p \le \infty$. ∎
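The norms of (5.3.1) are easy to compute directly. The following is a minimal sketch, assuming a vector in $R^3$ stored as a Python list; the function name `p_norm` and the sample vector are assumptions made for illustration. It also suggests how the values for large p approach the max norm (compare Exercise 17 of this section).

```python
import math


def p_norm(x, p):
    """The norm of (5.3.1a) for 1 <= p < infinity, and of (5.3.1b) for p = math.inf."""
    if p == math.inf:
        return max(abs(c) for c in x)
    return sum(abs(c) ** p for c in x) ** (1.0 / p)


x = [3.0, -4.0, 1.0]
values = {p: p_norm(x, p) for p in (1, 2, 4, 16, 64)}
sup_value = p_norm(x, math.inf)

for p, value in values.items():
    print(p, value)
print("inf", sup_value)
```

For this vector the values decrease from 8 at p = 1 toward the max norm 4 as p grows, which illustrates that each choice of p gives a different norm on the same linear space.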


EXAMPLE 2. Consider the set $l_p$, $1 \le p < \infty$, of all sequences $x = (x_1, x_2, \ldots)$
of scalars with the property that

$$\sum_{i=1}^{\infty} |x_i|^p < \infty.$$

It was shown in Chapter 4 that $l_p$ is a linear space. If we define

$$\|x\|_p = \Big[\sum_{i=1}^{\infty} |x_i|^p\Big]^{1/p} \tag{5.3.2}$$

we see that $\|x\|_p$ is a norm on $l_p$. The triangle inequality for $\|x\|_p$ follows from the
Minkowski Inequality for infinite sums. (See Appendix A.)
It was shown in Section 3.13 that $l_p$ with the metric determined from (5.3.2) is
complete; therefore, the space $(l_p, \|\cdot\|_p)$ is a Banach space. The norm $\|x\|_p$ is referred
to as the usual norm on $l_p$, $1 \le p < \infty$. ∎


EXAMPLE 3. We can define a norm $\|x\|_\infty$ on the space $l_\infty$ of all bounded
sequences $x = (x_1, x_2, \ldots)$ of scalars as

$$\|x\|_\infty = \sup\{|x_i|: 1 \le i < \infty\}.$$

We leave it as an exercise to show that $(l_\infty, \|\cdot\|_\infty)$ is a Banach space. ∎


EXAMPLE 4. Let (T,d) be a metric space and let X = BC(T,R) denote the
space of all real-valued, bounded, continuous functions defined on T. Thus $x \in X$
if and only if x(t) is a real-valued, continuous function defined on T with

$$\|x\|_\infty = \sup\{|x(t)|: t \in T\}$$

being finite. It is easily shown that X is a linear space. We ask the reader to show
that $\|\cdot\|_\infty$ is a norm on X. This is referred to as the sup-norm.

It is shown in Example 7, Section 3.13, that BC(T,R) is complete; therefore, it is
a Banach space.

Several commonly employed cases of this example occur when T is an interval
on the real line, such as T = [0,1], T = [0,∞), or T = (−∞,∞). (In this case, d is
the usual metric on T.) Other examples occur when $T = R^n$ or $T = C^n$ with one of the
metrics given in Example 1. ∎


EXAMPLE 5. Let (T,d) be a compact metric space. Then every real-valued
continuous function defined on T is bounded, by Theorem 3.17.21. In other words,
BC(T,R) is precisely C(T,R), the space of real-valued, continuous functions defined
on T. It follows from the last example that $(C(T,R), \|\cdot\|_\infty)$ is a Banach space. ∎


EXAMPLE 6. Consider the Lebesgue space $L_p[0,1]$, $1 \le p < \infty$, consisting of
all scalar-valued measurable functions x(t) defined for $0 \le t \le 1$ such that

$$\|x\|_p = \Big[\int_0^1 |x(t)|^p \, dt\Big]^{1/p}$$

is finite. The Minkowski Inequality for integrals (see Appendix A) shows that $\|x\|_p$
satisfies the triangle inequality. The other properties for a norm are easily verified
and we see that $\|x\|_p$ is a norm¹ and $(L_p, \|\cdot\|_p)$ is a normed linear space. It follows
from Theorem D.11.2 that this is a Banach space.


¹ Strictly speaking, $\|x\|_p$ is only a pseudonorm. However, we can change it to a norm by introducing
a new equality on $L_p[0,1]$ (see Example 10, Section 3.3). That is, we say that x = 0 (in the new
equality) if $\|x\|_p = 0$. It follows from Exercise 3, Section D.7, that x = 0 (in the new equality) if
and only if x(t) = 0 almost everywhere.




We could replace [0,1] with any interval (finite or infinite) I or, more generally,
by any measurable set A. In the latter case the norm would be defined by

$$\|x\|_p = \Big(\int_A |x(t)|^p \, dt\Big)^{1/p}.$$

Even more generally, if $(\Omega, \mathcal{B}, \mu)$ is any measure space, where $\mu$ is a positive
measure, and $L_p(\Omega)$ denotes the collection of all scalar-valued measurable functions
x defined on $\Omega$ with

$$\|x\|_p = \Big(\int_\Omega |x(\omega)|^p \, \mu(d\omega)\Big)^{1/p}$$

finite, then $\|x\|_p$ is a norm on $L_p(\Omega)$ and $(L_p(\Omega), \|\cdot\|_p)$ is a Banach space. Needless to
say, an extremely important example of this Banach space occurs when $(\Omega, \mathcal{B}, \mu)$
is a probability space $(\Omega, \mathcal{F}, P)$ (see Appendix E) and the x's are random variables
on $(\Omega, \mathcal{F}, P)$.

In any event, the norm $\|x\|_p$ is referred to as the usual norm on $L_p$. There are,
of course, other norms that can be placed on $L_p$. ∎


EXAMPLE 7. The Lebesgue space $L_\infty$ is defined as follows: Let I be an interval
(finite or infinite) and let $L_\infty(I)$ denote the collection of all scalar-valued measurable
functions x defined on I with the property that there is a B, $0 \le B < \infty$, such that

$$|x(t)| \le B \quad \text{(almost everywhere on } I\text{)}.$$

One can define a norm on $L_\infty(I)$ by

$$\|x\|_\infty = \operatorname{ess\,sup}\{|x(t)|: t \in I\}
  = \inf\{B: |x(t)| \le B \text{ almost everywhere on } I\}.$$

The norm $\|x\|_\infty$ is called the essential supremum of the function x. It is referred to
as the usual norm on $L_\infty$. This norm is sometimes referred to (somewhat inaccurately)
as the sup norm. It is shown in Theorem D.11.4 that $(L_\infty(I), \|\cdot\|_\infty)$ is a Banach
space. ∎


EXAMPLE 8. Let I = [a,b] be a bounded interval and let

$$P: a = t_0 < t_1 < \cdots < t_n = b$$

be a partition of I. Any scalar-valued function f, for which

$$V(f) = \sup\Big\{\sum_{i=1}^{n} |f(t_i) - f(t_{i-1})|: P \text{ is a partition of } I\Big\} \tag{5.3.4}$$

is finite, is said to be a function of bounded variation and V(f) is said to be the total
variation of f. One can define a norm on the collection BV(I) of all functions of
bounded variation by setting

$$\|f\| = |f(a+)| + V(f), \tag{5.3.5}$$

where $f(a+) = \lim_{t \to a+} f(t)$. We leave it as an exercise to show that this is indeed
a norm. ∎
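The supremum in (5.3.4) can be approached from below by evaluating the sum on finer and finer partitions. The sketch below is an illustration under assumed choices: the function f(t) = sin t on [0, 2π], uniform partitions, and the hypothetical helper names `variation_on_partition` and `uniform_partition`. Each partition sum is only a lower bound for V(f).

```python
import math


def variation_on_partition(f, points):
    """The sum in (5.3.4) for one fixed partition t_0 < t_1 < ... < t_n."""
    return sum(abs(f(t1) - f(t0)) for t0, t1 in zip(points, points[1:]))


def uniform_partition(a, b, n):
    """The partition of [a, b] into n subintervals of equal length."""
    return [a + (b - a) * i / n for i in range(n + 1)]


a, b = 0.0, 2.0 * math.pi
coarse = variation_on_partition(math.sin, uniform_partition(a, b, 3))
fine = variation_on_partition(math.sin, uniform_partition(a, b, 1000))

# For f = sin on [0, 2*pi] the total variation is 1 + 2 + 1 = 4, and f(a+) = 0,
# so the norm (5.3.5) of this f is also 4.
print(coarse, fine)
```

The coarse partition misses the extreme points of the sine and gives a smaller sum, while the fine partition contains them and essentially reproduces the total variation.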




EXAMPLE 9. Let $\Omega$ be a nonempty set and let $\mathcal{B}$ be a $\sigma$-algebra of sets in $\Omega$.
Let X denote the collection of all real-valued measures $\mu$ on $\Omega$ that can be decomposed
into positive and negative parts, $\mu = \mu^+ - \mu^-$, where $\mu^+$ and $\mu^-$ are positive
measures (see Example 2, Section D.10). Let Y denote the sub-collection
of those measures $\mu$ for which $\mu^+(\Omega) < \infty$ and $\mu^-(\Omega) < \infty$. A norm is given on Y by
setting

$$\|\mu\| = \mu^+(\Omega) + \mu^-(\Omega). \; ∎$$


EXAMPLE 10. (THE HÖLDER SPACES.) Let $\alpha$ satisfy $0 < \alpha \le 1$ and define
$C^\alpha[0,1]$ to be the space of all scalar-valued functions x that satisfy

$$|x(t) - x(s)| \le K|t - s|^\alpha \quad (0 \le t, s \le 1) \tag{5.3.6}$$

for some finite K. It follows that if x satisfies (5.3.6), then x is continuous. Let

$$N_\alpha(x) = \inf\{K: \text{(5.3.6) is satisfied}\}.$$


For example, if x(t) = cos at, then N,(x) = 2. 
Now define a norm on $C^\alpha[0,1]$ by

$$\|x\|_\alpha = \|x\|_\infty + N_\alpha(x),$$

where $\|x\|_\infty$ is the sup norm. This space is discussed further in the exercises. ∎


EXAMPLE 11. Let I = (a,b) be an open interval in R and let $C^n(I)$ denote the
collection of all scalar-valued functions defined on I with n continuous derivatives.
As usual, let $C^\infty(I) = \bigcap_{n=0}^{\infty} C^n(I)$, and let $C_0^\infty(I)$ denote those functions in
$C^\infty(I)$ that have compact support² in I. If u(t) is any differentiable function let
$Du = du/dt$, $D^2u = D(Du)$, and so on. We ask the reader to show that each of the
following is a norm on $C_0^\infty(I)$. For $1 \le p < \infty$, $n = 0, 1, \ldots$, and $u \in C_0^\infty(I)$ let

$$\|u\|_{n,p} = \Big[\int_a^b \sum_{i=0}^{n} |D^i u(t)|^p \, dt\Big]^{1/p}.$$

This also defines a norm on the collection of all functions u in $C^n(I)$ for which
$\|u\|_{n,p}$ is finite. ∎


² A function u is said to have compact support if there exists a compact set M in I such that
u(t) = 0 in I − M.


EXAMPLE 12. This is an extension of Example 11. Replace I by $\Omega$, where $\Omega$
is an open set in $R^m$. Define $C^n(\Omega)$, $C^\infty(\Omega)$, and $C_0^\infty(\Omega)$ as above, but now in terms of
partial derivatives. Let $\alpha = (\alpha_1, \ldots, \alpha_m)$ be a vector with integral entries $\alpha_i$ where
$\alpha_i \ge 0$. Let $|\alpha| = \sum_{i=1}^{m} \alpha_i$. Define the differential operator $D^\alpha$ by

$$D^\alpha u = \frac{\partial^{|\alpha|} u}{\partial x_1^{\alpha_1} \cdots \partial x_m^{\alpha_m}},$$

where $u(x) = u(x_1, \ldots, x_m)$. We ask the reader to show that each of the following is
a norm on $C_0^\infty(\Omega)$. For $1 \le p < \infty$ and $u \in C_0^\infty(\Omega)$ let

$$\|u\|_{n,p} = \Big[\int_\Omega \sum_{|\alpha| \le n} |D^\alpha u(x)|^p \, dx\Big]^{1/p}.$$

These spaces are used in generalized function theory or distribution theory. ∎


EXAMPLE 13. Let X denote the linear space $L_2[0,T]$ with the usual norm. Let
Y denote the set of all linear mappings l of X into its scalar field that can be
represented by

$$l(x) = l_y(x) = \int_0^T y(t)x(t) \, dt \quad \text{for all } x \in X,$$

where $y \in C[0,T]$, that is, each y determines an $l_y$. By defining addition and scalar
multiplication in the natural way, we see that Y is a linear space. Furthermore, we
claim that

$$\|l_y\| = \sup_{\|x\| \le 1} |l_y(x)|$$

defines a norm on Y. (Does it?) We also remark that $\|l_y\|$ can be computed from

$$\|l_y\| = \Big\{\int_0^T |y(t)|^2 \, dt\Big\}^{1/2}.$$
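The last remark can be explored numerically for real-valued functions. The following is a rough sketch under several assumptions that are not part of the example: T = 1, the particular function y(t) = 1 + t, and midpoint-rule quadrature on a uniform grid. The names `integral`, `l2_norm`, and `functional` are hypothetical. The candidate maximizer x = y/||y|| is suggested by the Schwarz inequality.

```python
import math

T = 1.0
N = 2000
h = T / N
grid = [(k + 0.5) * h for k in range(N)]   # midpoints of the N subintervals


def integral(values):
    """Midpoint-rule approximation of an integral over [0, T]."""
    return h * sum(values)


def l2_norm(f):
    """Approximate usual norm on L_2[0, T] of a real-valued function f."""
    return math.sqrt(integral([f(t) ** 2 for t in grid]))


def functional(y, x):
    """Approximate value l_y(x), the integral of y(t) x(t) over [0, T]."""
    return integral([y(t) * x(t) for t in grid])


def y(t):
    return 1.0 + t          # an assumed continuous function on [0, 1]


norm_y = l2_norm(y)                                   # about sqrt(7/3)
attained = functional(y, lambda t: y(t) / norm_y)     # l_y at x = y / ||y||
other = functional(y, lambda t: 1.0)                  # l_y at x = 1, a unit vector
print(norm_y, attained, other)
```

The unit vector y/||y|| yields the value ||y||, while the unit vector x = 1 yields the smaller value 3/2, in line with the formula stated in the example.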


EXERCISES 


1. Show that $(l_\infty, \|\cdot\|_\infty)$ is a Banach space.


2. Let $c_0$ denote all sequences $x = (x_1, x_2, \ldots)$ of scalars with the property that
$\lim x_n = 0$.
(a) Show that $\|x\|_\infty = \sup\{|x_i|: 1 \le i < \infty\}$ is a norm on $c_0$.
(b) Show that $c_0$ is a linear subspace of $l_\infty$.
(c) Show that $c_0$ is a closed subset of $(l_\infty, \|\cdot\|_\infty)$.


3. (Continuation of Exercise 2.) Let c denote the collection of all convergent
sequences $x = (x_1, x_2, \ldots)$ of scalars.
(a) Show that $\|x\|_\infty$ is a norm on c.
(b) Show that $(c, \|\cdot\|_\infty)$ is a closed subset of $(l_\infty, \|\cdot\|_\infty)$.


4. Show that $\|\cdot\|_\infty$ is a norm on BC(T,R) (see Example 4).


5. Let $C_0 = C_0(-\infty,\infty)$ denote the space of real-valued continuous functions,
defined for $-\infty < t < \infty$, with compact support.
(a) Show that $\|x\|_\infty = \sup\{|x(t)|: t \in R\}$ is a norm on $C_0$.
(b) Show that $(C_0, \|\cdot\|_\infty)$ is not complete.
(c) Show that $C_0$ is a linear subspace of X = BC(R,R) (see Example 4).
(d) Is $C_0$ a closed subset of $(X, \|\cdot\|_\infty)$?


6. In Example 7, show that the essential supremum defines a norm on $L_\infty(I)$.
Show that $(L_\infty, \|\cdot\|_\infty)$ is a Banach space.


7. Let I be a finite interval and $1 \le p < p' \le \infty$. Show that $L_{p'}(I) \subset L_p(I)$. Are
these spaces equal? What happens if I is an infinite interval?




8. (Generalization of Exercise 7.) Let $(\Omega, \mathcal{F}, P)$ be a probability space. Show that
L(Q) < LQ) < £,(Q). 

9. This exercise refers to Example 8.
(a) Show that V(f) is a pseudonorm on BV(I).
(b) Show that (5.3.5) defines a norm on BV(I).

10. (Continuation of Exercise 9.) Relabel V(f) = V(f; a,b). This is the total
variation of f on the interval [a,b]. For $a \le t \le b$, let V(f; a,t) denote the total
variation of f on the interval [a,t].

(a) Show that V(f; a,t) is an increasing function of t.

(b) Show that g(t) = V(f; a,t) − f(t) is an increasing function of t.

(c) Show that a function f on I has bounded variation if and only if it can be
written as the difference of two monotone functions.

11. (Continuation of Exercise 9.) A function f of bounded variation on I = [a,b] is
said to be normalized if f(a) = 0 and f is continuous from the right, that is,
f(t + 0) = f(t) for $a \le t < b$. Let NBV(I) denote the collection of all normalized
functions of bounded variation on I. Show that V(f) is a norm on NBV(I).

12. (This exercise refers to Example 10.) Consider the unit ball in $C^\alpha[0,1]$, where
$0 < \alpha \le 1$,

$$B_\alpha = \{x \in C^\alpha[0,1]: \|x\|_\alpha \le 1\}.$$

Show that $B_\alpha$ is a compact set in $(C[0,1], \|\cdot\|_\infty)$. [Hint: Use Ascoli's Theorem.]


13. (Continuation of Exercise 12.) Let $0 < \alpha < \beta \le 1$. Show that the unit ball $B_\beta$ is
a compact set in $(C^\alpha[0,1], \|\cdot\|_\alpha)$. [Hint: Note that

$$N_\alpha(x) = \sup\{|x(t) - x(s)|\,|t - s|^{-\alpha}: t \ne s\}.$$

Now apply Ascoli's Theorem to the function of two variables
$y(t,s) = |x(t) - x(s)|\,|t - s|^{-\alpha}$, when $x \in B_\beta$.]

14. (Orlicz Spaces.³) Let p(t) be a right continuous real-valued function defined for
$t \ge 0$ such that p(0) = 0, p(t) > 0 for t > 0 and $p(\infty) = \lim_{t \to \infty} p(t) = \infty$.
Let $M(u) = \int_0^{|u|} p(t) \, dt$ and assume that there are constants $\alpha > 0$, $u_0 \ge 0$ such
that

$$u\,p(u) \le \alpha M(u), \quad u \ge u_0.$$

Let $q(s) = \sup\{t: p(t) \le s\}$ and $N(v) = \int_0^{|v|} q(s) \, ds$ and assume that there are
constants $\beta > 0$ and $v_0 \ge 0$ such that

$$v\,q(v) \le \beta N(v), \quad v \ge v_0.$$

Let I = [a,b] be a finite interval, and let $L_M(I)$ be the collection of all real-valued
functions u defined on I for which

$$\rho(u; M) = \int_a^b M(u(x)) \, dx < \infty.$$

Define $L_N(I)$ similarly. For $u \in L_M(I)$ let

$$\|u\|_M = \sup\Big\{\Big|\int_a^b u(x)v(x) \, dx\Big|: v \in L_N(I) \text{ and } \rho(v; N) \le 1\Big\}.$$

(a) Show that $\|u\|_M$ is a norm on $L_M(I)$.
(b) Show that $p(t) = t^{\alpha - 1}$, for $\alpha > 1$, satisfies the conditions above. Compute
q(s), $\rho(u; M)$, and $\rho(v; N)$. Compute $\|u\|_M$ when $\alpha = 2$.

³ For more details see Krasnosel'skii and Rutickii [1].

15. Consider the space $M^n$ of all n x n matrices with real coefficients. Since $M^n$
can be identified with $R^{n^2}$ we can use any of the norms in Example 1 on $M^n$.
However, another norm is oftentimes used. Let $\|\cdot\|$ be a fixed norm on $R^n$.
Define the norm of a matrix [L] in $M^n$ by

$$\|[L]\| = \sup\{\|Lx\|: \|x\| \le 1\}.$$

(a) Show that $\|[L]\|$ is a norm on $M^n$.

(b) Show that $\|[L][M]\| \le \|[L]\| \cdot \|[M]\|$, where [L][M] denotes the matrix
product of the two matrices [L] and [M].

(c) Let [L] be the 2 x 2 matrix $\begin{pmatrix} a & b \\ b & c \end{pmatrix}$ with a > 0, c > 0 and assume that $R^2$
has the Euclidean norm $\|x\|_2 = (x_1^2 + x_2^2)^{1/2}$. Show that

$$\|L\| = \tfrac{1}{2}\big[a + c + \sqrt{(a - c)^2 + 4b^2}\,\big].$$

(d) What happens if the matrices have complex coefficients? (The concept of
norm introduced in this exercise is generalized in Section 8.)


16. Let $H_p$, $1 \le p < \infty$, denote the collection of all functions f(z) that are analytic
for |z| < 1 with the property that

$$\|f\|_p = \sup_{0 \le r < 1} \Big\{\int_0^{2\pi} |f(re^{i\theta})|^p \, d\theta\Big\}^{1/p} < \infty.$$

Show that $\|\cdot\|_p$ is a norm and that $(H_p, \|\cdot\|_p)$ is a Banach space.

17. In Example 1, show that $\|x\|_\infty = \lim_{p \to \infty} \|x\|_p$.


18. In Examples 11 and 12, show that $\|u\|_{n,p}$ defines a norm on $C_0^\infty(I)$ and
$C_0^\infty(\Omega)$, respectively.


4. SEQUENCES AND SERIES 


The first offspring of the wedding of topological and algebraic structure is the
concept of an infinite series. Let $(X, \|\cdot\|)$ be a normed linear space and let $\{x_n\}$ be a
sequence in X. Since X is a linear space, one can consider finite sums of the form

$$y_m = x_1 + x_2 + \cdots + x_m = \sum_{n=1}^{m} x_n,$$

which generate a new sequence $\{y_m\}$, the sequence of partial sums. Since X is also
a metric space, we can test whether the sequence $\{y_m\}$ converges to a limit y, which
means that $y_m \to y$ as $m \to \infty$ if and only if $\|y_m - y\| \to 0$ as $m \to \infty$. If this limit
y exists, we say that the infinite series $\sum_{n=1}^{\infty} x_n$ converges and we write

$$y = \sum_{n=1}^{\infty} x_n.$$

The infinite series is said to diverge if the sequence of partial sums $\{y_m\}$ fails to have
a limit. It is important to emphasize that, just as in the case of metrics, convergence
and divergence depend on the norm used on the underlying linear space X.

Let us now turn to Banach spaces, which the reader will recall are complete 
normed linear spaces. The story of convergence of infinite series in Banach space 
is simpler because one can use the Cauchy test for convergence without knowing the 
limit of the sequence of partial sums. 


5.4.1 LEMMA. (THE CAUCHY TEST.) Let X be a Banach space. An infinite series
$\sum_{n=1}^{\infty} x_n$ converges in X if and only if for each $\varepsilon > 0$ there is an integer N such that

$$\Big\|\sum_{i=m}^{n} x_i\Big\| \le \varepsilon$$

whenever $n \ge m \ge N$.


The proof of this lemma is trivial, since it merely states that the series converges 
if and only if the sequence of partial sums is a Cauchy sequence. 

Although the last lemma is useful, a somewhat stronger statement is more
practical and widely used. A series $\sum_{n=1}^{\infty} x_n$ in a normed linear space X is said to
converge absolutely if the series of absolute values $\sum_{n=1}^{\infty} \|x_n\|$ is convergent.

The series $\sum_{n=1}^{\infty} \|x_n\|$ is a series of real numbers and it converges or diverges
on the real line. Since the real line is complete, it follows from Lemma 5.4.1 that
the series $\sum_{n=1}^{\infty} \|x_n\|$ converges if and only if for each $\varepsilon > 0$ there is an integer N such
that

$$\sum_{i=m}^{n} \|x_i\| \le \varepsilon$$

whenever $n \ge m \ge N$.


How are absolute convergence and convergence related? In a Banach space we 
have the following answer. 


5.4.2 THEOREM. Let X be a Banach space. If the series $\sum_{n=1}^{\infty} x_n$ is absolutely
convergent, then it is convergent.


Proof: It follows from the triangle inequality (N2) that

$$\Big\|\sum_{i=m}^{n} x_i\Big\| \le \sum_{i=m}^{n} \|x_i\|.$$

Using the Cauchy Test it follows that the series $\sum_{n=1}^{\infty} x_n$ is convergent. ∎


EXAMPLE 1. Let X be the Banach space $C[-\pi,\pi]$ with the sup norm $\|\cdot\|_\infty$.
The series

$$\sum_{n=1}^{\infty} \frac{3^n}{n!} \cos nt$$

is absolutely convergent since

$$\sum_{n=1}^{\infty} \Big\|\frac{3^n}{n!} \cos nt\Big\|_\infty = \sum_{n=1}^{\infty} \frac{3^n}{n!} = e^3 - 1.$$

Therefore, this series is convergent. ∎
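The estimate behind Theorem 5.4.2 can be observed numerically for this series. The following sketch rests on assumptions made only for illustration: the sup norm is estimated on a finite grid of 401 points in [−π, π], and the helper names `term_norm`, `partial_sum`, `sup_distance`, and `tail` are hypothetical.

```python
import math


def term_norm(n):
    """Sup norm of the n-th term (3^n / n!) cos(nt) on [-pi, pi]."""
    return 3.0 ** n / math.factorial(n)


def partial_sum(m, t):
    """Value at t of the m-th partial sum of the series."""
    return sum(term_norm(n) * math.cos(n * t) for n in range(1, m + 1))


grid = [-math.pi + 2.0 * math.pi * k / 400 for k in range(401)]


def sup_distance(m, n):
    """Grid estimate of the sup norm of y_n - y_m."""
    return max(abs(partial_sum(n, t) - partial_sum(m, t)) for t in grid)


def tail(m, n):
    """Sum of the norms of the terms with index m+1, ..., n."""
    return sum(term_norm(i) for i in range(m + 1, n + 1))


total = sum(term_norm(n) for n in range(1, 41))
print(total, math.exp(3.0) - 1.0)
print(sup_distance(10, 30), tail(10, 30))
```

The sum of the norms agrees with $e^3 - 1$ to rounding error, and the distance between two partial sums is bounded by the corresponding tail of the series of norms, which is the Cauchy estimate used in the proof.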


This next example shows that in an incomplete space an absolutely convergent 
series need not converge. 


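EXAMPLE 2. Let X be the normed linear space $C[-\pi,\pi]$ with the norm

$$\|x\| = \Big(\int_{-\pi}^{\pi} |x(t)|^2 \, dt\Big)^{1/2}.$$

This space is not complete. The series

$$\sum_{n=1}^{\infty} \frac{1}{n} \sin nt$$

is absolutely convergent; however, it is not convergent. Indeed, in the Banach
space $L_2[-\pi,\pi]$ one has

$$\sum_{n=1}^{\infty} \frac{1}{n} \sin nt = y(t),$$

where

$$y(t) = \begin{cases} (\pi - t)/2, & 0 < t \le \pi, \\ 0, & t = 0, \\ -(\pi + t)/2, & -\pi \le t < 0. \end{cases}$$

But y(t), not being continuous, is not in X. However, the series does converge in
$L_2[-\pi,\pi]$. ∎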


The reader familiar with series of real numbers may recall that the nice thing
about absolutely convergent series of real numbers is that no matter how we rearrange
the terms in the series, the rearranged series (i) converges and (ii) the limit is
the same. Almost the same thing is true in this more general setting.

We will say that a series $\sum x_n$ is unconditionally convergent if it and each of its
rearrangements (i) are convergent and (ii) have the same limit. Recall that a
rearrangement is a reordering of the terms in an infinite series.


5.4.3 THEOREM. Let X be a Banach space. If a series $\sum_{n=1}^{\infty} x_n$ is absolutely
convergent, then it is unconditionally convergent.
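A small numerical illustration of unconditional convergence in the simplest Banach space, the real line, may help. This is a sketch under stated assumptions: the series $\sum (-1)^{n+1}/n^2$ and the particular rearrangement (two positive terms, then one negative term) are choices made here for illustration, and the known value $\pi^2/12$ of the sum is used only as a reference.

```python
import math


def term(n: int) -> float:
    """n-th term of the absolutely convergent series sum (-1)**(n+1) / n**2."""
    return (-1.0) ** (n + 1) / n ** 2


def original_partial_sum(count: int) -> float:
    """Partial sum in the natural order."""
    return sum(term(n) for n in range(1, count + 1))


def rearranged_partial_sum(blocks: int) -> float:
    """Partial sum of a rearrangement: two odd-index terms, then one even-index term."""
    total = 0.0
    odd, even = 1, 2
    for _ in range(blocks):
        total += term(odd) + term(odd + 2)
        odd += 4
        total += term(even)
        even += 2
    return total


target = math.pi ** 2 / 12.0
print(original_partial_sum(300000) - target)
print(rearranged_partial_sum(100000) - target)
```

Both differences are small and shrink as more terms are taken. The rearranged partial sums approach the limit more slowly, but the limit itself does not change.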




In the case of series of real numbers it can be shown that unconditional
convergence implies absolute convergence. This is not the case for Banach spaces in
general. Dvoretzky and Rogers [1] have shown that absolute convergence is
equivalent to unconditional convergence if and only if the Banach space is finite
dimensional.


EXERCISES

1. Prove Theorem 5.4.3. [Hint: Consider $\sum x_n$ and $\sum x_n'$, where one is a
   rearrangement of the other. Given any $\varepsilon > 0$, there exists an integer N such that $m > n > N$
   implies $\sum_{i=n}^{m} \|x_i\| < \varepsilon$. Choose an integer p such that $x_1, x_2, \ldots, x_N$ are
   contained in the set $\{x_1', x_2', x_3', \ldots, x_p'\}$.]

2. Let $\{y_n\}$ be a sequence in a normed linear space $(X,\|\cdot\|)$. Show that this sequence
   converges if and only if the series $\sum_{n=1}^{\infty} x_n$ converges where $x_n = y_n - y_{n-1}$ and
   $y_0 = 0$.

3. (Continuation of Exercise 15, Section 3.) We will let A be an n × n matrix with
   $\|A\| = a < 1$.
   (a) Show that $\|A^n\| \le a^n$ for n = 1, 2, ....
   (b) Let $B_0 = (I + A)(I - A)$, $B_1 = (I + A^2)B_0$ and $B_n = (I + A^{2^n})B_{n-1}$, where
       I denotes the identity matrix. Show that $\|B_n - I\| \le a^{2^{n+1}}$.
   (c) Let $C_n = B_0 + \sum_{i=1}^{n} \{B_i - I\}$. Show that the infinite series converges
       absolutely.
   (d) What is $\lim C_n$?
   (e) Let $D_n = (I + A^{2^n})(I + A^{2^{n-1}}) \cdots (I + A)$. What is $\lim D_n$?

4. (Continuation of Exercise 15, Section 3.) Show that

   $$e^A = \sum_{n=0}^{\infty} \frac{1}{n!} A^n$$

   is well-defined, where A is an n × n matrix. Compute $e^A$ where A is a diagonal
   matrix.

5. (Continuation of Exercise 15, Section 3.) Let A be an n × n matrix with $\|A\| < 1$.
   (a) Show that

       $$\log(I - A) = -\sum_{n=1}^{\infty} \frac{1}{n} A^n$$

       is well-defined.
   (b) Show that $\exp[\log(I - A)] = I - A$.

6. (Continuation of Exercise 4.) We will let f(z) be an entire function, that is,
   $f(z) = \sum_{n=0}^{\infty} c_n z^n$ for all z in the complex plane C.
   (a) Show that for any n × n matrix A, the formula

       $$f(A) = \sum_{n=0}^{\infty} c_n A^n$$

       defines an n × n matrix f(A).
   (b) What is f(A) where A is a projection matrix, that is, when $A^2 = A$?




7. (Continuation of Exercise 6.) Consider the power series $\sum_{n=0}^{\infty} c_n A^n$, where A
   is an n × n matrix with complex coefficients and the $c_n$ are complex numbers.
   (a) Show that there is a number r, $0 \le r \le +\infty$, such that this series is
       absolutely convergent if $\|A\| < r$ and for any $s > r$, there is a matrix A with
       $\|A\| = s$ such that the series is divergent. (The number r is called the radius
       of convergence.)
   (b) Show that r satisfies

       $$r^{-1} = \limsup_{n \to \infty} \sqrt[n]{|c_n|}.$$

8. (Integral Test.) Let $\sum_{n=1}^{\infty} x_n$ be an infinite series in a Banach space $(X,\|\cdot\|)$ and
   assume that there is a decreasing positive function f(t) defined for $0 \le t < \infty$
   such that $\|x_n\| \le f(n)$. Show that if $f \in L_1[0,\infty)$, then the series $\sum_{n=1}^{\infty} x_n$ is
   absolutely convergent.

9. (Weierstrass M-Test.) Let $\sum_{n=1}^{\infty} x_n$ be an infinite series in a Banach space
   $(X,\|\cdot\|)$ and assume that $\|x_n\| \le M_n$. Show that if $\sum_{n=1}^{\infty} M_n < \infty$, then the
   series $\sum_{n=1}^{\infty} x_n$ is absolutely convergent.

10. Consider the norms $\|\cdot\|_p$, $1 \le p \le \infty$, on C[0,T].
    (a) Show that if $\|x_n - x\|_\infty \to 0$ as $n \to \infty$, then for every p, $1 \le p < \infty$, one has
        $\|x_n - x\|_p \to 0$ as $n \to \infty$.
    (b) Show that if $1 \le r < s \le \infty$, then $\|x\|_r \le T^{(s-r)/rs} \|x\|_s$.
    (c) Show that if $1 \le r < s \le \infty$ and $\|x_n - x\|_s \to 0$, then $\|x_n - x\|_r \to 0$.
    (d) Find an example of a sequence $\{x_n\}$ in C[0,T] and a point x in C[0,T]
        with $\|x_n - x\|_1 \to 0$ but $\|x_n - x\|_\infty \not\to 0$.

11. Let $l_p$ (p > 0) be the collection of all sequences $x = \{x_1, x_2, \ldots\}$ of scalars with
    the property that $\sum_n |x_n|^p < \infty$.
    (a) Show that $\|x\|_\infty = \sup\{|x_n| : n = 1, 2, \ldots\}$ is a norm on $l_p$.
    (b) Show that for $0 < p < \infty$, $(l_p, \|\cdot\|_\infty)$ is not a Banach space.

12. Let A be an n × n matrix and let t be a real number. Then $e^{tA}$ is given by

    $$e^{tA} = \sum_{n=0}^{\infty} \frac{t^n A^n}{n!}.$$

    (See Exercise 4.)
    (a) Show that $x(t) = e^{tA} x_0$ is a solution of

        $$\frac{dx}{dt} = Ax$$

        that satisfies $x(0) = x_0$.
    (b) Let $|\cdot|_2$ be the Euclidean norm on $R^n$. For σ real, let $Y_\sigma$ denote the
        collection of all continuous functions $x: [0,\infty) \to R^n$ such that

        $$\int_0^{\infty} |x(t)|_2^2 \, e^{-2\sigma t} \, dt < \infty.$$

        Define a norm $\|\cdot\|_\sigma$ on $Y_\sigma$ by

        $$\|x\|_\sigma = \left( \int_0^{\infty} |x(t)|_2^2 \, e^{-2\sigma t} \, dt \right)^{1/2}.$$

        Show that there is a σ such that the mapping $T: x_0 \to e^{tA} x_0$ of $R^n$ into $Y_\sigma$ is
        continuous.


5. LINEAR SUBSPACES 


A linear subspace M of a normed linear space X is, as one would expect, a 
linear subspace in the algebraic sense equipped with the norm of X, that is, the 
normed linear space $(M,\|\cdot\|)$ is a linear subspace of $(X,\|\cdot\|)$. We have already seen
some examples of this in the exercises following Section 3. In this section we present 
some of the fundamental facts about subspaces. The first lemma shows that open 
linear subspaces are rather uninteresting. 


5.5.1 LEMMA. Let M be a linear subspace of a normed linear space X. If M is 
open (as a subset of X), then M = X. 


Proof: Let x be any point in X. We want to show that $x \in M$. Since M is a
linear subspace, the origin 0 lies in M. So we can assume that $x \ne 0$. Since M is open,
there is a local neighborhood U of 0 that lies in M. That is, there is an $\varepsilon > 0$ such
that if $z \in X$ and $\|z\| < \varepsilon$, then $z \in M$. It follows, then, that for any nonzero $x \in X$ the point

$$z = \frac{\varepsilon}{2\|x\|} \, x$$

lies in M. Since M is a linear subspace, the point $x = 2\|x\| \varepsilon^{-1} z$ lies in M. ∎


The case where M is closed is far more interesting. 
Using the fact that addition, scalar multiplication and norm are continuous
operations, one can easily establish the following result.


5.5.2 THEOREM. The closure of a linear subspace is a closed linear subspace. 


Needless to say, the fact that we get a closed set is immediate. What has to be 
shown is that this closed set is also a linear subspace. 

Now let M be a linear subspace in a Banach space X. We know already that M 
is a normed linear space, and we may ask whether M is a Banach space. That is, is 
M complete? The answer is simple, elegant, and not too surprising. 


5.5.3 THEOREM. Let M be a linear subspace of a Banach space X. Then M is a
Banach space if and only if M is closed.


EXAMPLE 1. Exercises 2 and 3 of Section 3 gave two examples of closed linear
subspaces of the Banach space $(l_\infty, \|\cdot\|_\infty)$. As a matter of fact, the space $(c_0, \|\cdot\|_\infty)$
is a closed linear subspace of $(c, \|\cdot\|_\infty)$. Exercise 5, Section 3 gives an example of a
linear subspace of a Banach space that is not closed. ∎


EXAMPLE 2. Consider the real space C[0,1] with the norm

$$\|x\|_2 = \left\{ \int_0^1 |x(t)|^2 \, dt \right\}^{1/2}.$$

This space is not complete. Let M denote the collection of all functions $x \in C[0,1]$
of the form $x(t) = a \sin 2\pi t + b \cos 2\pi t$, where a and b are scalars. Then

$$\|x\|_2^2 = \int_0^1 (a^2 \sin^2 2\pi t + 2ab \sin 2\pi t \cdot \cos 2\pi t + b^2 \cos^2 2\pi t) \, dt = \tfrac{1}{2}(a^2 + b^2).$$

We ask now, is the space M complete? We can show that it is by showing that it is
isometrically equivalent to a complete metric space, namely C, the complex
numbers, considered as a two-dimensional real normed linear space, with the norm

$$\|z\| = \frac{1}{\sqrt{2}} |a + ib| = \frac{1}{\sqrt{2}} (a^2 + b^2)^{1/2}.$$

The identification is simply $x(\cdot) \to a + ib$. Since C is complete, it follows that M is
complete by Theorem 3.13.6.

In this case M is a two-dimensional subspace of the given normed linear space.
In Section 10 we shall show that every finite-dimensional linear subspace of a
normed linear space is complete. ∎
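The isometry in Example 2 can be checked numerically. The sketch below is an illustration only; the function names and the midpoint-rule quadrature are assumptions made here. It compares the squared $L_2$ norm of $a \sin 2\pi t + b \cos 2\pi t$ on [0,1] with the squared scaled modulus $\tfrac{1}{2}|a + ib|^2$.

```python
import math


def l2_norm_squared(a: float, b: float, steps: int = 2000) -> float:
    """Midpoint-rule value of the integral over [0, 1] of (a sin 2 pi t + b cos 2 pi t)**2."""
    h = 1.0 / steps
    total = 0.0
    for k in range(steps):
        t = (k + 0.5) * h
        x = a * math.sin(2.0 * math.pi * t) + b * math.cos(2.0 * math.pi * t)
        total += x * x * h
    return total


def scaled_complex_norm(a: float, b: float) -> float:
    """The norm |a + ib| / sqrt(2) placed on the complex numbers."""
    return abs(complex(a, b)) / math.sqrt(2.0)


for a, b in [(1.0, 0.0), (0.0, 1.0), (3.0, -4.0), (0.5, 2.0)]:
    print(a, b, l2_norm_squared(a, b), scaled_complex_norm(a, b) ** 2)
```

The two printed columns agree to rounding error, which is the statement that the map $x(\cdot) \to a + ib$ preserves the norm.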


If we have a proper linear subspace M in a normed linear space X, a little
reflection shows that it is possible for M to be dense in X. For example, if $X = l_2$
and M is the linear subspace made up of all sequences with at most a finite number
of nonzero terms, then M is a proper subspace of X and dense in X. On the other
hand, if M is closed, the only way it can be dense in X is to be equal to X. In other
words, if M is a proper closed subspace of X, then there are points a nonzero
distance from M; and this brings us to an important geometric concept.

In ordinary three-dimensional Euclidean space, a vector x is orthogonal to a
plane M if and only if $d(x,M) = \|x\|$. Refer to Figure 5.5.1.

[Figure 5.5.1. A plane M with a vector $x_1$ for which $d(x_1,M) = \|x_1\|$ and a vector $x_2$ for which $d(x_2,M) < \|x_2\|$.]

In normed linear spaces in general there is no concept of orthogonality (because there is no inner
product). However, if M is a proper linear subspace of a normed linear space X,
we can still ask if $d(x,M) = \|x\|$. It is tempting to argue on the basis of geometric
intuition that if M is closed (so M will not be dense in X) and X is complete, then
we can always find an x in X − M such that $d(x,M) = \|x\|$. Unfortunately, this is
one of those places where geometric intuition can go wrong. All we can say in
general is stated in the next theorem.


5.5.4 THEOREM. (RIESZ THEOREM.*) Let M be a proper closed linear subspace
of a normed linear space X, and let $\varepsilon > 0$. Then there exists an $x \in X$ with $\|x\| = 1$
such that $d(x,M) \ge 1 - \varepsilon$.


Carefully note that this theorem does not claim that there exists a unit vector 
x such that d(x,M) = 1. Sometimes there is, but then again sometimes there is not 
(see Example 3 below). We shall see subsequently that this pathology cannot occur 
in Hilbert spaces. 


[Figure 5.5.2. The subspace M and the vector y = z − m.]

Proof: It goes without saying that $0 \le d(x,M) \le \|x\|$ for all $x \in X$. Since M
is closed and not all of X, there are x's such that $0 < d(x,M) \le \|x\|$ [that is,
d(x,M) is nonzero]. The issue is to find an x with $\|x\| = 1$ such that $(1 - \varepsilon) \le d(x,M) \le \|x\|$. In
other words, we want to find an x that is "almost orthogonal" to M. If M = {0},
the proof is trivial, so we assume M contains some nonzero vectors. Since M is
closed, there is a point $z \in X - M$ such that $d(z,M) = \delta > 0$. See Figure 5.5.2.
Since

$$d(z,M) = \inf\{\|z - m'\| : m' \in M\},$$

there is an $m \in M$ such that $\|z - m\| \le \delta(1 + \varepsilon)$. Moreover, if $y = z - m$, then
$d(y,M) = d(z,M) = \delta$. Thus, if α is any scalar, we have $\|\alpha y\| \le |\alpha| \delta (1 + \varepsilon)$ and
$d(\alpha y, M) = |\alpha| \delta$. (Why?) Finally, if we let $x = (1/\|y\|) y$, then $\|x\| = 1$ and
$d(x,M) \ge 1/(1 + \varepsilon) > 1 - \varepsilon$. ∎


* This theorem should not be confused with the Riesz Representation Theorem (Theorem 5.21.1). 




The next example shows that, in general, one cannot set ¢ = 0 in the above 
theorem. 


EXAMPLE 3. Let X denote the linear space made up of all real-valued
continuous functions x(t) defined on the interval $0 \le t \le 1$ and that satisfy x(0) = 0.
Let $\|\cdot\|_\infty$ denote the sup-norm. Then $(X,\|\cdot\|_\infty)$ is a Banach space. [It is a closed linear
subspace of the Banach space $(C[0,1],\|\cdot\|_\infty)$.] Let M denote the linear subspace of X
consisting of those functions x(t) that satisfy

$$l(x) = \int_0^1 x(t) \, dt = 0.$$

We leave it to the reader to show that M is closed. (See Theorem 5.6.6.)

We want to show that if $x_0 \in X$ with $\|x_0\|_\infty = 1$, then $d(x_0,M) < 1$. That is,
even though X is complete and M is closed, there is no $x_0$ that is "orthogonal"
to M.

Suppose for a moment that we consider M as a linear subspace of $(C[0,1],\|\cdot\|_\infty)$.
Suppose further that $x_1$ is a point in C[0,1] with $\|x_1\|_\infty = 1$ and $|x_1(0)| = 1$. Since
y(0) = 0 for each $y \in M$, it follows that $\|x_1 - y\|_\infty \ge 1$ for all y in M. Then since
$\|x_1 - 0\|_\infty = 1$, it follows that $d(x_1,M) = 1$.

Next suppose that $x_2$ is a point in C[0,1] with $\|x_2\|_\infty = 1$ and $|x_2(0)| = \alpha < 1$.
It follows that $\|x_2 - y\|_\infty \ge \alpha$ for all $y \in M$. Moreover, we can show that there
exists a $\hat{y} \in M$ such that $\alpha \le \|x_2 - \hat{y}\|_\infty < 1$. The basic idea is illustrated by Figure
5.5.3. That is, $\alpha \le d(x_2,M) < 1$. The construction of a $\hat{y}$ is rather tedious, so we
shall merely sketch it here.

[Figure 5.5.3. A function $x_2$ with values between −1 and +1 and a function $\hat{y}$ with $\hat{y}(0) = 0$ and $\int_0^1 \hat{y}(t) \, dt = 0$.]

Assume that $x_2(0) = \alpha \ge 0$. Since $x_2$ is continuous, there
exists a maximal nontrivial interval† $A_1$ starting at t = 0 such that $-1 < x_2(t) < 1$
for all $t \in A_1$. Then since $\|x_2\|_\infty = 1$ and $x_2$ is continuous, one has $|x_2(a_1)| = 1$ with
$a_1 = \sup A_1$. Note that $a_1 \notin A_1$. Next we have the maximal closed interval $B_1 =
[a_1,b_1]$ upon which $x_2(t) = x_2(a_1)$. It can happen that $a_1 = b_1$. Then if $b_1 \ne 1$, we
have the nontrivial interval $C_1$ where $0 < x_2(t) < 1$, if $x_2(b_1) = 1$, or $-1 < x_2(t) < 0$,
if $x_2(b_1) = -1$. We construct $\hat{y}$ on $I_1 = A_1 \cup B_1 \cup C_1$ so that $\hat{y}(0) = 0$,

$$\int_{I_1} \hat{y}(t) \, dt = 0 \quad \text{and} \quad \sup_{t \in I_1} |x_2(t) - \hat{y}(t)| < 1.$$

This construction is continued until we have defined $\hat{y}$ on all of [0,1].

Hence, for $\|x\|_\infty = 1$ we have that d(x,M) = 1 if and only if |x(0)| = 1.

Return now to the original problem where X is the Banach space containing the
closed linear subspace M. Since all the $x \in X$ satisfy x(0) = 0, it follows that there
is no $x_0$ in X with $\|x_0\| = 1$ such that $d(x_0,M) = 1$.

We leave it to the reader to show that if $z_n \in X$, where

$$z_n(t) = \begin{cases} nt, & \text{for } 0 \le t \le \frac{1}{n}, \\ 1, & \text{for } \frac{1}{n} \le t \le 1, \end{cases}$$

then $\|z_n\| = 1$, and $\lim_{n \to \infty} d(z_n,M) = 1$. (See Figure 5.5.4.) ∎

[Figure 5.5.4.]
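The limit claimed for the ramp functions $z_n$ can be made plausible numerically. The sketch below rests on an assumption that is not proved in this section: the standard fact that the distance from a point x to the null space of a bounded linear functional l equals $|l(x)| / \|l\|$, together with $\|l\| = 1$ for $l(x) = \int_0^1 x(t) \, dt$ on X. Under that assumption $d(z_n,M) = \int_0^1 z_n(t) \, dt$, which the code evaluates with the trapezoid rule. The function names are illustrative.

```python
def ramp(n: int, t: float) -> float:
    """z_n(t): rises linearly from 0 to 1 on [0, 1/n], then stays at 1."""
    return min(n * t, 1.0)


def integral_of_ramp(n: int, steps_per_unit: int = 1000) -> float:
    """Trapezoid-rule value of the integral of z_n over [0, 1]; the corner t = 1/n is a grid point."""
    steps = steps_per_unit * n
    h = 1.0 / steps
    total = 0.5 * (ramp(n, 0.0) + ramp(n, 1.0))
    for k in range(1, steps):
        total += ramp(n, k * h)
    return total * h


for n in (1, 2, 5, 10, 50):
    # Assumed distance formula: d(z_n, M) = |l(z_n)| / ||l|| with ||l|| = 1.
    print(n, integral_of_ramp(n), 1.0 - 1.0 / (2 * n))
```

The computed values match $1 - 1/(2n)$, which is below 1 for every n and tends to 1, in line with the statement of the example.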


† If $A_1 = [0,1]$, then set $\hat{y} = 0$. Otherwise continue with the rest of the construction.

EXERCISES

1. Prove Theorem 5.5.2.

2. (Continuation of Exercise 11, Section 3.) Show that NBV(I) is a closed linear
   subspace of BV(I).

3. Consider $L_2(I)$ with the usual norm $\|\cdot\|_2$. Let $z \in L_2(I)$. Now show that
   $M = \{x \in L_2(I) : \int_I xz \, dt = 0\}$ is a closed linear subspace of $L_2(I)$. When is M
   a proper subspace?

4. Consider $L_p(I)$ with the usual norm $\|\cdot\|_p$ where $1 \le p < \infty$. Let $z \in L_q(I)$ where
   $p^{-1} + q^{-1} = 1$. Show that $M = \{x \in L_p(I) : \int_I xz \, dt = 0\}$ is a closed linear
   subspace of $L_p(I)$.

5. Consider $L_\infty(I)$ with the usual norm $\|\cdot\|_\infty$. Let $z \in L_1(I)$. Now show that
   $M = \{x \in L_\infty(I) : \int_I xz \, dt = 0\}$ is a closed linear subspace of $L_\infty(I)$. What
   happens if we replace $L_\infty(I)$ with BC(I)?

6. Let $y_1, y_2, \ldots, y_n$ be a finite collection of functions in $L_2(I)$, where $L_2(I)$ has
   the usual norm $\|\cdot\|_2$. Show that

   $$M = \left\{ x \in L_2(I) : \int_I x y_i \, dt = 0, \ 1 \le i \le n \right\}$$

   is a closed linear subspace of $L_2(I)$.

7. Show that one can choose $\varepsilon = 0$ in Theorem 5.5.4 when M is of finite dimension.

8. Let X be a linear subspace in a Banach space B and assume that

   $$\inf_{x \in X} \|y - x\| = d(y,X) \le C\|y\|$$

   for all $y \in B$. Show that if C < 1, then X is dense in B. [Hint: Use the Riesz
   Theorem.]

9. Show that the trivial subspace {0} is closed.


6. CONTINUOUS LINEAR TRANSFORMATIONS 


The most important offspring of the wedding between topological and alge- 
braic structure is the concept of a continuous linear transformation. 


5.6.1 DEFINITION. Let X and Y be normed linear spaces. A mapping L: X → Y
is said to be a continuous linear transformation if it is (a) linear and (b) continuous.


As the reader probably suspects, many of the examples of continuous
transformations in Chapter 3 are also linear. Likewise, many of the examples of linear
transformations in Chapter 4 are also continuous. In particular, Examples 1, 4, 5, 
6, 7 of Section 5 in Chapter 3 and Examples 2 and 3 of Section 3 in Chapter 4 are of 
this nature. Let us consider some other examples. 


EXAMPLE 1. Let $X = BC[0,\infty)$, the normed linear space made up of all
bounded continuous functions defined on $[0,\infty)$ with the sup-norm $\|x\|_\infty$. Let T
denote the operation of evaluating the "running average," that is,

$$(Tx)(t) = \frac{1}{t} \int_0^t x(\tau) \, d\tau.$$

It can be seen (from L'Hospital's Rule, for example) that

$$\lim_{t \to 0} \left[ \frac{1}{t} \int_0^t x(\tau) \, d\tau \right] = x(0).$$

Moreover, Tx is clearly a continuous function of t, and

$$\left| \frac{1}{t} \int_0^t x(\tau) \, d\tau \right| \le \frac{1}{t} \int_0^t |x(\tau)| \, d\tau \le \|x\|_\infty$$

for all t, so Tx is also bounded. Hence, T is a mapping of $BC[0,\infty)$ into itself. T is
obviously linear. Let us show that T is continuous. Suppose that $x_0, x \in BC[0,\infty)$
and that $\varepsilon > 0$ is given. Then

$$\|Tx - Tx_0\| = \sup_t \left| \frac{1}{t} \int_0^t x(\tau) \, d\tau - \frac{1}{t} \int_0^t x_0(\tau) \, d\tau \right| \le \sup_t \left[ \frac{1}{t} \int_0^t |x(\tau) - x_0(\tau)| \, d\tau \right] \le \|x - x_0\|.$$

So $\|x - x_0\| < \delta = \varepsilon$ implies $\|Tx - Tx_0\| < \varepsilon$. It follows that T is continuous at $x_0$.
Since $x_0$ is arbitrary, T is continuous. ∎
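The two inequalities for the running average can be observed on a grid. The following is a minimal sketch under assumptions made here: the interval is truncated to [0, 20], the integral is approximated by the trapezoid rule, and the two test functions are arbitrary bounded continuous choices.

```python
import math


def running_average(x, t_max=20.0, steps=4000):
    """Grid values of x and of (Tx)(t) = (1/t) * integral_0^t x(s) ds on [0, t_max]."""
    h = t_max / steps
    values = [x(i * h) for i in range(steps + 1)]
    averages = [values[0]]  # (Tx)(0) is taken to be x(0), the limit as t -> 0
    integral = 0.0
    for i in range(1, steps + 1):
        integral += 0.5 * h * (values[i - 1] + values[i])  # trapezoid rule
        averages.append(integral / (i * h))
    return values, averages


def sup_norm(samples):
    """Largest absolute value among the samples."""
    return max(abs(v) for v in samples)


def x(t):
    return math.cos(t) + 0.5 * math.sin(3.0 * t)


def x0(t):
    return math.cos(t)


x_vals, Tx_vals = running_average(x)
x0_vals, Tx0_vals = running_average(x0)
input_gap = sup_norm([u - v for u, v in zip(x_vals, x0_vals)])
output_gap = sup_norm([u - v for u, v in zip(Tx_vals, Tx0_vals)])
print(sup_norm(Tx_vals), sup_norm(x_vals), output_gap, input_gap)
```

On the grid each average is a convex combination of sampled values, so the discrete analogue of both bounds holds for the same reason as in the text.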


EXAMPLE 2. Let X be the normed linear space made up of all bounded
analytic functions f(z) defined inside the unit circle, that is, |z| < 1. We define the norm
by

$$\|f\| = \sup_{|z| < 1} |f(z)|.$$

Since f is analytic, it has a unique Taylor power series expansion

$$f(z) = a_0 + a_1 z + a_2 z^2 + \cdots.$$

Let T be the mapping of X into itself defined by

$$(Tf)(z) = a_0 + a_1 z, \quad \text{for } |z| < 1.$$

Now $a_0 = f(0)$ and $a_1 = (df/dz)(0)$ so T is linear. Indeed T is a projection. Let us
show that T is continuous.

We know from Cauchy's integral formula for analytic functions that

$$f^{(n)}(z_0) = \frac{n!}{2\pi i} \oint_C \frac{f(z) \, dz}{(z - z_0)^{n+1}}, \quad n = 0, 1, \ldots,$$

where C is a closed curve within and on which f is analytic. Thus $|a_0| \le \|f\|$ and
$|a_1| \le \|f\|$. Therefore, $|(Tf)(z)| = |a_0 + a_1 z| \le 2\|f\|$ and so $\|Tf\| \le 2\|f\|$. It follows
that if $f, f_0 \in X$, then $\|Tf - Tf_0\| \le 2\|f - f_0\|$ and T is continuous.

In summary, then, T is a continuous projection. ∎


EXAMPLE 3. Consider the integral operator y = Kx, where

$$y(t) = \int_I k(t,\tau) x(\tau) \, d\tau$$

and t, τ belong to some interval I. Assume that

$$B = \int_I \int_I |k(t,\tau)|^q \, d\tau \, dt < \infty, \tag{5.6.1}$$

where $1 \le p < \infty$. Then K represents a continuous linear mapping of $L_p(I)$ into
$L_q(I)$, where $p^{-1} + q^{-1} = 1$.

In order to show this we shall assume that $1 < p < \infty$. (The case p = 1 is left
as an exercise.) It follows from the Hölder Inequality for integrals (see Appendix A)
that

$$\left| \int_I k(t,s) x(s) \, ds \right| \le \left[ \int_I |k(t,s)|^q \, ds \right]^{1/q} \left[ \int_I |x(s)|^p \, ds \right]^{1/p}.$$

Therefore,

$$\int_I |y(t)|^q \, dt \le \int_I \left[ \int_I |k(t,s)|^q \, ds \right] \left[ \int_I |x(s)|^p \, ds \right]^{q/p} dt \le \left[ \int_I \int_I |k(t,s)|^q \, ds \, dt \right] \left[ \int_I |x(s)|^p \, ds \right]^{q/p}.$$

In other words, $\|y\|_q \le B^{1/q} \|x\|_p$, where B is given by (5.6.1). It follows then from
$\|y - y_0\|_q \le B^{1/q} \|x - x_0\|_p$ that K is a continuous linear mapping of $L_p(I)$ into
$L_q(I)$. Of particular importance is the case p = q = 2. ∎


EXAMPLE 4. Consider the integral operator y = Kx, where

$$y(t) = \int_{-\infty}^{\infty} k(t - \tau) x(\tau) \, d\tau, \quad t \in (-\infty,\infty). \tag{5.6.2}$$

This integral represents a time-invariant operator. We assume that $k \in L_1(-\infty,\infty)$.
Let us now show that K is a bounded linear transformation of $L_1(-\infty,\infty)$ into
itself. Let $y_i = Kx_i$, i = 1, 2. Then

$$\int_{-\infty}^{\infty} |y_1(t) - y_2(t)| \, dt = \int_{-\infty}^{\infty} \left| \int_{-\infty}^{\infty} k(t - \tau)[x_1(\tau) - x_2(\tau)] \, d\tau \right| dt \le \int_{-\infty}^{\infty} \int_{-\infty}^{\infty} |k(t - \tau)| \, |x_1(\tau) - x_2(\tau)| \, d\tau \, dt.$$

By interchanging the order of integration (see Appendix D) and using the fact that
for every τ one has $\int_{-\infty}^{\infty} |k(t - \tau)| \, dt = \|k\|_1$, one gets

$$\|y_1 - y_2\|_1 \le \|k\|_1 \|x_1 - x_2\|_1.$$

Hence, K is continuous.
More generally, if $k \in L_1(-\infty,\infty)$, then (5.6.2) represents a continuous linear
transformation of $L_p(-\infty,\infty)$ into itself. (See Exercise 17.) ∎
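A discrete analogue of the estimate $\|y_1 - y_2\|_1 \le \|k\|_1 \|x_1 - x_2\|_1$ can be checked directly. This is a sketch under assumptions made here: the functions are sampled on a uniform grid over [-5, 5], integrals are replaced by Riemann sums, and the kernel and inputs are arbitrary illustrative choices.

```python
import math


def convolve(k, x, h):
    """Riemann-sum approximation of y(t) = integral k(t - tau) x(tau) dtau on a grid of spacing h."""
    out = [0.0] * (len(k) + len(x) - 1)
    for i, k_i in enumerate(k):
        for j, x_j in enumerate(x):
            out[i + j] += h * k_i * x_j
    return out


def l1_norm(samples, h):
    """Riemann-sum approximation of the L1 norm."""
    return h * sum(abs(v) for v in samples)


h = 0.05
grid = [-5.0 + h * i for i in range(201)]
k = [math.exp(-abs(t)) for t in grid]                      # kernel samples
x1 = [math.exp(-t * t) for t in grid]                      # first input
x2 = [math.exp(-t * t) * math.cos(3.0 * t) for t in grid]  # second input
diff = [u - v for u, v in zip(x1, x2)]

# By linearity, y1 - y2 is the convolution of k with x1 - x2.
lhs = l1_norm(convolve(k, diff, h), h)
rhs = l1_norm(k, h) * l1_norm(diff, h)
print(lhs, rhs)
```

The discrete inequality follows from the triangle inequality and a change in the order of summation, which mirrors the interchange of integrals in the example.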




EXAMPLE 5. In Example 2 of Section 3.5 we gave an example of a
discontinuous operator which is also linear. Let us consider here another discontinuous
linear operator.

Let $X = C^1[-1,1]$ and $Y = C[-1,1]$ be given with the sup-norm $\|\cdot\|_\infty$.
Consider the differential operator D: x → dx/dt. Let us show that D is discontinuous at
the origin. Indeed if $x_n(t) = n^{-1} \sin nt$, then $\|x_n\|_\infty \le n^{-1}$. However $Dx_n = \cos nt$,
so $\|Dx_n\|_\infty = 1$. Hence D is discontinuous at x = 0. One can easily show that D
is discontinuous everywhere. ∎
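The behavior of the sequence $x_n(t) = n^{-1} \sin nt$ under differentiation can be tabulated. The sketch below is an illustration only; the sup norm is approximated by sampling on a uniform grid (an assumption made here), and the helper names are hypothetical.

```python
import math


def sup_norm(f, a=-1.0, b=1.0, steps=2000):
    """Approximate the sup norm of f on [a, b] by sampling steps + 1 equally spaced points."""
    return max(abs(f(a + (b - a) * i / steps)) for i in range(steps + 1))


def x_n(n):
    """The function t -> sin(n t) / n."""
    return lambda t: math.sin(n * t) / n


def dx_n(n):
    """Its derivative t -> cos(n t)."""
    return lambda t: math.cos(n * t)


for n in (1, 10, 100, 1000):
    print(n, sup_norm(x_n(n)), sup_norm(dx_n(n)))
```

The first column of norms tends to zero while the second stays at 1, so the inputs approach the origin but their images under D do not.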


So much for examples. Let us now turn to some of the properties of continuous
linear transformations.
Recall that in Lemma 4.3.2 we showed that a transformation L: X → Y is linear
if and only if

$$L(\alpha_1 x_1 + \cdots + \alpha_n x_n) = \alpha_1 L(x_1) + \cdots + \alpha_n L(x_n),$$

for all $x_1, \ldots, x_n$ in X and all scalars $\alpha_1, \ldots, \alpha_n$. There we carefully pointed out
that we were only considering finite linear combinations. Now that we have
topological structure present we can consider the infinite case.


5.6.2 THEOREM. Let X and Y be normed linear spaces, and let L be a
transformation of X into Y. The transformation L is a continuous linear transformation if
and only if

$$L\left( \sum_{i=1}^{\infty} \alpha_i x_i \right) = \sum_{i=1}^{\infty} \alpha_i L(x_i), \tag{5.6.3}$$

for every convergent series $\sum_{i=1}^{\infty} \alpha_i x_i$.

Carefully note that we are only considering series $\sum_{i=1}^{\infty} \alpha_i x_i$ that converge.
The reason should be obvious. Further note that if we choose to call the conclusion
of this theorem the principle of superposition, then the principle of superposition is a
characterization of continuous linear transformations.


Proof: The proof of this theorem is a direct consequence of Theorem 3.7.2
and Lemma 4.3.2. The former asserts that L is continuous if and only if

$$L\left( \lim_{n \to \infty} z_n \right) = \lim_{n \to \infty} L(z_n),$$

for every convergent sequence $\{z_n\}$.
First assume that L is linear and continuous, and let $\sum \alpha_i x_i$ be any convergent
series in X. Then the sequence of partial sums

$$z_n = \sum_{i=1}^{n} \alpha_i x_i$$

is convergent, and from Theorem 3.7.2 we have

$$L\left( \lim_{n \to \infty} z_n \right) = \lim_{n \to \infty} L(z_n).$$

By definition $\lim_{n \to \infty} z_n = \sum_{i=1}^{\infty} \alpha_i x_i$, so

$$L\left( \sum_{i=1}^{\infty} \alpha_i x_i \right) = \lim_{n \to \infty} L(z_n) = \lim_{n \to \infty} \sum_{i=1}^{n} \alpha_i L(x_i) = \sum_{i=1}^{\infty} \alpha_i L(x_i).$$

Now assume that (5.6.3) holds for every convergent series $\sum \alpha_i x_i$. We want to
show that L is linear and continuous. Using series with only a finite number of
nonzero terms and Lemma 4.3.2, we see that L is linear. So let us turn to showing
that L is continuous. Let $\{z_n\}$ be any convergent sequence in X, and let $x_n =
z_n - z_{n-1}$, where $z_0 = 0$. It follows that $z_n = \sum_{i=1}^{n} x_i$ and that the series $\sum_{i=1}^{\infty} x_i$
is convergent with $\lim_{n \to \infty} z_n = \sum_{i=1}^{\infty} x_i$. It follows from (5.6.3) that

$$L\left( \sum_{i=1}^{\infty} x_i \right) = \sum_{i=1}^{\infty} L(x_i)$$

or

$$L\left( \lim_{n \to \infty} z_n \right) = \lim_{n \to \infty} \sum_{i=1}^{n} L(x_i).$$

Then using the linearity of L, or (5.6.3) again, we have

$$L\left( \lim_{n \to \infty} z_n \right) = \lim_{n \to \infty} L\left( \sum_{i=1}^{n} x_i \right) = \lim_{n \to \infty} L(z_n).$$

Since $\{z_n\}$ was an arbitrary convergent sequence, it follows from Theorem 3.7.2
that L is continuous. ∎


EXAMPLE 6. Let us define $J: L_2[-\pi,\pi] \to L_2[-\pi,\pi]$ by y = Jx, where
$y(t) = \int_0^t x(s) \, ds$. This mapping is a continuous linear mapping. Since the series
$\sum_{n=1}^{\infty} (1/n) \sin nt$ converges in $L_2$, with the usual metric, one has

$$\int_0^t \sum_{n=1}^{\infty} \frac{1}{n} \sin ns \, ds = \sum_{n=1}^{\infty} \frac{1}{n} \int_0^t \sin ns \, ds = -\sum_{n=1}^{\infty} \frac{1}{n^2} (\cos nt - 1).$$ ∎
n=1N 


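EXAMPLE 7. Let $\{\varphi_1, \varphi_2, \ldots\}$ be a sequence of functions in the real space
$L_2(I)$ with the usual norm. Assume that $\int_I \varphi_n(t) \varphi_m(t) \, dt = \delta_{nm}$, where $\delta_{nm}$ is the
Kronecker function. Let $a = \{a_1, a_2, \ldots\}$ be a sequence of real numbers in $l_2$. Then
form the series

$$x = \sum_{n=1}^{\infty} a_n \varphi_n.$$

One can show that this series converges in $L_2(I)$ by using the Cauchy Test (Lemma
5.4.1). Indeed, one has for n > m

$$\left\| \sum_{i=m}^{n} a_i \varphi_i \right\|^2 = \int_I (a_m \varphi_m + \cdots + a_n \varphi_n)^2 \, dt = \sum_{i,j=m}^{n} a_i a_j \int_I \varphi_i(t) \varphi_j(t) \, dt = \sum_{i,j=m}^{n} a_i a_j \delta_{ij} = \sum_{i=m}^{n} |a_i|^2.$$

Since this sequence $\{a_n\}$ is in $l_2$ we know that $\lim_{m,n \to \infty} \sum_{i=m}^{n} |a_i|^2 = 0$. Hence,
the original series converges in $L_2(I)$.

Let $K: L_2(I) \to R$ be given by $x \to \int_I k(t) x(t) \, dt$, where k is a fixed element in
$L_2(I)$. K is linear and by the Hölder Inequality for integrals, from Appendix A,
one has

$$\left| \int_I k(t)[x_1(t) - x_2(t)] \, dt \right| \le \left\{ \int_I |k(t)|^2 \, dt \right\}^{1/2} \left\{ \int_I |x_1(t) - x_2(t)|^2 \, dt \right\}^{1/2} \le \|k\| \, \|x_1 - x_2\|.$$

It follows that K is continuous. Therefore,

$$Kx = \int_I k(t) \left\{ \sum_{n=1}^{\infty} a_n \varphi_n(t) \right\} dt = \sum_{n=1}^{\infty} a_n \left\{ \int_I k(t) \varphi_n(t) \, dt \right\}.$$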
I n=1 n=1 I 


At this point let us note a very important distinction between continuous and
discontinuous linear transformations. So far in each of the examples in this section,
the ratio

$$\frac{\|Lx\|}{\|x\|} \qquad (\|x\| \ne 0)$$

is bounded for continuous linear transformations and unbounded for discontinuous
linear transformations. We shall now show that this is, in fact, always the case. First,
we need the concept of a bounded linear transformation.


5.6.3 DEFINITION. Let L: X → Y be a linear transformation, where X and Y
are normed linear spaces. We shall say that L is bounded if there is a real number
$M \ge 0$ such that

$$\|Lx\| \le M\|x\|,$$

for all x in X.


Before we proceed, a word concerning the notation is needed. Since X and Y
are allowed to be different normed linear spaces, one sometimes uses different
notation for the norm on each space. For example, one might use $\|\cdot\|_X$ and $\|\cdot\|_Y$ to
denote these norms. However, where no confusion will arise, this distinction is
dropped. In any event, in the above inequality $\|x\|$ denotes the norm in X (since x
is in X) and $\|Lx\|$ denotes the norm in Y (since y = Lx is in Y).

The ratio $\|Lx\|/\|x\|$, where $\|x\| \ne 0$, can be viewed as the "gain" or
"amplification" of the operator L for the "input" x. It is not difficult to see that this gain
is bounded if and only if L is bounded. Indeed, if L is bounded, then $\|Lx\|/\|x\| \le M$;
and if $\|Lx\|/\|x\|$ is bounded, then $\|Lx\| \le \{\sup_{x \ne 0} \|Lx\|/\|x\|\} \|x\| = M\|x\|$.

The next theorem—and it is a very important one—shows how boundedness is 
related to continuity for linear transformations. 


5.6.4 THEOREM. A linear transformation L: X → Y, where X and Y are
normed linear spaces, is continuous if and only if it is bounded.


Proof: Let us prove the "if" part first, that is, assume that L is bounded. It
follows from the linearity of L and the definition of bounded that one has

$$\|Lx - Lx_0\| = \|L(x - x_0)\| \le M\|x - x_0\|.$$

By setting $\delta = \varepsilon/M$ we see that L is continuous at $x_0$. Since $x_0$ is arbitrary we see
that L is continuous.

Now for the "only if" part. Assume that L is continuous. Then L is continuous
at 0. Thus for $\varepsilon = 1$, there is a $\delta > 0$ such that $\|Lx'\| \le 1$ whenever $\|x'\| \le \delta$. If x
is any point in X, $x \ne 0$, let $x' = \beta x$ where $\beta = \delta\|x\|^{-1}$. Since $\|x'\| = \delta$, one has
$\|Lx'\| \le 1$. Therefore,

$$1 \ge \|Lx'\| = \|L(\beta x)\| = |\beta| \, \|Lx\|,$$

which implies that

$$\|Lx\| \le \frac{1}{|\beta|} = \frac{1}{\delta} \|x\|. \tag{5.6.4}$$

That is, (5.6.4) holds for every point x in X with $x \ne 0$. But it also holds for x = 0,
so we see that L is bounded. (Carefully note that δ is independent of x.) ∎
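For a finite-dimensional illustration of the link between a bound M and the choice $\delta = \varepsilon/M$, consider a matrix acting on $R^2$ with the Euclidean norm. The following is a sketch under assumptions made here: the matrix A is hypothetical, and the Frobenius norm is used as one valid (not necessarily smallest) bound M.

```python
import math
import random

A = [[2.0, -1.0], [0.5, 3.0]]  # hypothetical matrix defining L(x) = A x on R^2


def apply(matrix, x):
    """Matrix-vector product."""
    return [sum(row[j] * x[j] for j in range(len(x))) for row in matrix]


def norm(x):
    """Euclidean norm."""
    return math.sqrt(sum(v * v for v in x))


# Frobenius norm: a valid bound M with |A x| <= M |x| (Cauchy-Schwarz on each row).
M = math.sqrt(sum(v * v for row in A for v in row))

random.seed(0)
gains = []
for _ in range(1000):
    x = [random.uniform(-1.0, 1.0), random.uniform(-1.0, 1.0)]
    if norm(x) > 0.0:
        gains.append(norm(apply(A, x)) / norm(x))
largest_gain = max(gains)

# Continuity at x0 with delta = eps / M, as in the "if" part of the proof.
eps = 1e-3
delta = eps / M
x0 = [1.0, -2.0]
x = [x0[0] + 0.6 * delta, x0[1] - 0.6 * delta]  # |x - x0| = 0.6 * sqrt(2) * delta < delta
output_gap = norm([u - v for u, v in zip(apply(A, x), apply(A, x0))])
print(largest_gain, M, output_gap, eps)
```

Every sampled gain stays below M, and an input within δ of $x_0$ produces an output within ε of $Lx_0$.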


The last theorem has a rather interesting mathematical interpretation. Since L
is linear one has L(0) = 0. Now if one examines the definition of bounded, one sees
that L is bounded if and only if L is continuous at x = 0. The last theorem, then,
asserts that a linear transformation is continuous (everywhere) if and only if it is
continuous at a single point, namely, 0. It is a simple exercise to show that 0 can be
replaced by any other point x in X. Hence, we have the following result.


5.6.5 LEMMA. Let L: X → Y be a linear transformation, where X and Y are
normed linear spaces. If L is continuous at one point x in X, then L is continuous
everywhere.


Finally, it is interesting to note that if a linear transformation is continuous, 
it is uniformly continuous. This fact follows from the boundedness of the trans- 
formation. 


Theorems 5.6.2 and 5.6.4 are fundamental statements about continuous linear
transformations. We urge the reader to master them before continuing. Moreover,
we remark that we shall henceforth use the terms "continuous linear transformation"
and "bounded linear transformation" interchangeably.

We end this section by noting a simple but important fact about continuous
linear transformations.

5.6.6 THEOREM. If L: X → Y is a continuous linear transformation, then $\mathcal{N}(L)$,
the null space of L, is a closed linear subspace of X.


The proof of this theorem is left as an exercise. 


EXERCISES

1. Show that the mapping K in Example 3 is continuous when p = 1.

2. Show that a linear transformation is continuous if and only if it is uniformly
   continuous.

3. Let $B_1$ and $B_2$ be two Banach spaces and let X be a linear subspace of $B_1$. Let
   $L: X \to B_2$ be a continuous linear transformation.
   (a) Show that L has a continuous extension $\bar{L}$ defined on the closure $\bar{X}$.
   (b) Show that $\bar{L}$ is necessarily linear.
   (c) Show that $\bar{L}$ is unique. [Hint: Use Theorem 3.14.3.]

4. Use the result of Exercise 3 to characterize the Lebesgue integral as an extension
   of the Riemann integral. (Compare with Appendix D.)

5. Let A, B be two continuous linear mappings of X into Y, where X and Y are
   normed linear spaces.
   (a) Show that A + B is a continuous linear mapping of X into Y.
   (b) Show that for every scalar α, the mapping αA is a continuous linear
       mapping of X into Y. (Thus, in other words, the collection of continuous
       linear mappings of X into Y is a linear space. We shall show in Section 8
       that this is a normed linear space.)

6. Show by means of specific examples that the Principle of Superposition fails
   for discontinuous linear transformations.

7. Show by examples that Theorem 5.6.4 and Lemma 5.6.5 fail for nonlinear
   mappings.

8. (a) Prove Lemma 5.6.5.
   (b) Prove Theorem 5.6.6.

9. (Closed operators.) Let $B_1$ and $B_2$ be two Banach spaces and let X be a linear
   subspace of $B_1$. A linear operator $L: X \to B_2$ is said to be closed if whenever
   $x_n \to x$ in $B_1$ and $y_n = Lx_n \to y$ in $B_2$ one has $x \in X$ and y = Lx. Show that every
   continuous linear operator is closed. [Remark: There are closed linear
   operators that are not continuous as is shown in Exercise 11. Also, do not confuse
   this concept with that of a closed mapping given in Exercise 5, Section 3.12.
   We shall return to this concept in Section 7.10.]




10. Let Gr(L) be the graph of L, that is,

    $$\mathrm{Gr}(L) = \{(x,Lx) : x \in X\} \subset B_1 \times B_2.$$

    (a) Show that Gr(L) is a linear subspace of $B_1 \times B_2$.
    (b) Show that L is closed if and only if Gr(L) is a closed linear subspace of
        $B_1 \times B_2$, where $B_1 \times B_2$ has the norm $\|(x_1,x_2)\| = \|x_1\| + \|x_2\|$.

11. Let $B_1 = B_2 = C[0,1]$ be given with the sup norm and let $X = C^1[0,1]$. Let
    $D: X \to C[0,1]$ be the differential operator D: x → dx/dt. [Note: D is linear and
    discontinuous.] Show that D is closed.

12. Let $X = C^n[0,1]$ and let P(z) be a polynomial in z of degree n. Now we let
    $P(D): X \to C[0,1]$ be the associated polynomial differential operator. Show
    that P(D) is closed, where X and C[0,1] have the sup norm.

13. (Continuation of Exercise 9.) Let $B_1 = B_2 = L_2(-i\infty,i\infty)$ be given with usual
    norm. Let f(iω) be any measurable, scalar-valued function defined on the
    imaginary axis $(-i\infty,i\infty)$. Let $X = \{x \in L_2(-i\infty,i\infty) : fx \in L_2(-i\infty,i\infty)\}$
    and define $F: X \to L_2(-i\infty,i\infty)$ by y = Fx, where y(iω) = f(iω)x(iω).
    (a) Show that F is closed.
    (b) Find conditions on f in order that $X = L_2(-i\infty,i\infty)$. Is F continuous in
        this case? (A special case of this problem occurs when f is continuous.)

14. (Continuation of Exercise 9.) Let $B_1 = B_2 = C(D)$, where D is the unit
    disk $\{(r,\theta) : 0 \le r \le 1\}$ in the plane, given with the sup-norm. Let X denote the
    collection of $C^2$ functions u(r,θ) such that u(1,θ) = 0 for $0 \le \theta \le 2\pi$, and let
    $\nabla^2: X \to C(D)$ be the Laplacian operator, that is,

    $$\nabla^2 u = \frac{\partial^2 u}{\partial r^2} + \frac{1}{r} \frac{\partial u}{\partial r} + \frac{1}{r^2} \frac{\partial^2 u}{\partial \theta^2}.$$

    Show that $\nabla^2$ is closed.
15. Let $0 < \alpha < 1$ and define $f_1(t) = 1$ and


t, O<t<lI1/n 
f(t)=(t-*-(n— Dt, Ifn<t< 1f(n—-1) 
0, Ifm—-1)<t<l, 
    for n = 2, 3, .... In which spaces $L_p[0,1]$ does the infinite series $\sum_{n=1}^{\infty} f_n$
    converge? Compute

    $$\int_0^1 \sum_{n=1}^{\infty} f_n(t) \, dt.$$


16. Let ρ(x) be a nonnegative real-valued $C^\infty$-function on $R^m$ with ρ(x) = 0 for
    $|x| \ge 1$ and $\int_{R^m} \rho(x) \, dx = 1$. (For an appropriate choice of the constant C the
    function

    $$\rho(x) = \begin{cases} C \exp\left( \dfrac{1}{|x|^2 - 1} \right), & |x| < 1, \\ 0, & |x| \ge 1 \end{cases}$$

    satisfies these conditions.) For $\varepsilon > 0$ define the mollifier operator $J_\varepsilon$ by

    $$J_\varepsilon u(x) = \varepsilon^{-m} \int_\Omega \rho\left( \frac{x - y}{\varepsilon} \right) u(y) \, dy,$$

    where u(y) is defined in a bounded open domain Ω in $R^m$.
    (a) Show that if $u \in L_1(\Omega)$ then $J_\varepsilon(u) \in C_0^\infty(R^m)$, that is, $J_\varepsilon(u)$ is a $C^\infty$-function
        with compact support in $R^m$.
    (b) Show that $J_\varepsilon: L_p(\Omega) \to L_p(\Omega)$ is a bounded linear transformation.
    (c) Show that $\|J_\varepsilon u - u\|_p \to 0$ as $\varepsilon \to 0$ for each $u \in L_p(\Omega)$. [Hint: First show
        that

        $$J_\varepsilon u(x) - u(x) = \int_{|\xi| \le 1} \rho(\xi)[u(x - \varepsilon\xi) - u(x)] \, d\xi.$$

        Then show that (c) holds when u is continuous. Finally, approximate an
        arbitrary u in $L_p$ with a continuous function to get the general result.]
    (d) Show that if $u \in C_0^\infty(\Omega)$, then

        $$D^\alpha(J_\varepsilon u) = (-1)^{|\alpha|} J_\varepsilon(D^\alpha u).$$


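17. Let $k \in L_1(-\infty,\infty)$. Show that the integral operator K given by

    $$y(t) = \int_{-\infty}^{\infty} k(t - \tau) x(\tau) \, d\tau$$

    is a continuous linear mapping of $L_p(-\infty,\infty)$ into itself, when $1 \le p \le \infty$.
    [Hint: Note that if $x(\cdot) \in L_p(-\infty,\infty)$, and $z(\cdot) \in L_q(-\infty,\infty)$, where $p^{-1} + q^{-1} = 1$,
    then

    $$\left| \int_{-\infty}^{\infty} \int_{-\infty}^{\infty} k(t - \tau) x(\tau) \, d\tau \, z(t) \, dt \right| \le \|k\|_1 \|x\|_p \|z\|_q.$$

    Now apply Exercise 5, Section D.11.]

18. Define $L: L_2(-\infty,\infty) \to L_2(-\infty,\infty)$ by L: x(t) → x(−t). Is L linear and
    continuous?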


7. INVERSES AND CONTINUOUS INVERSES


Let L: X → Y be a mapping of X into Y. In Section 2.7, we saw that L has an
inverse defined on its range if and only if L is one-to-one. In Section 4.4, we saw
that if L is linear, then it is one-to-one if and only if $\mathcal{N}(L) = \{0\}$, that is, the null
space is trivial. These considerations require only algebraic structure. However,
when X and Y are normed linear spaces, one can inquire into the continuity of L
and $L^{-1}$.

The first problem we would like to solve is: Find conditions on L that guarantee
that $L^{-1}: \mathcal{R}(L) \to X$ exists and is continuous.


244 COMBINED TOPOLOGICAL AND ALGEBRAIC STRUCTURE 


5.7.1 THEOREM. Let $L: X \to Y$ be a linear transformation, where X and Y are
normed linear spaces. If there is a constant $m > 0$ such that

$$m\|x\| \le \|Lx\|, \quad x \in X, \qquad (5.7.1)$$

then L has a continuous inverse $L^{-1}$ defined on its range $\mathcal{R}(L)$,

$$L^{-1}: \mathcal{R}(L) \to X,$$

and

$$\|L^{-1}y\| \le \frac{1}{m}\|y\| \qquad (5.7.2)$$

for all y in $\mathcal{R}(L)$. Conversely, if there is a continuous inverse $L^{-1}: \mathcal{R}(L) \to X$, then
there is a positive constant m such that (5.7.1) holds for all x in X.


Note that we do not require the operator L to be continuous. Also, this theorem 
says nothing about the range of L; that is, (5.7.1) being satisfied does not allow us 
to say that L is or is not a mapping of X onto Y. In general, this question must be 
handled separately. 

A linear operator $L: X \to Y$ that satisfies (5.7.1) is said to be bounded below.


Proof: First assume that (5.7.1) holds for some $m > 0$. In order to show that
$L^{-1}: \mathcal{R}(L) \to X$ exists, we must show that the null space $\mathcal{N}(L)$ contains only the
origin. This fact follows immediately from (5.7.1). Indeed, if $x \ne 0$, then $Lx \ne 0$
since $\|Lx\| \ge m\|x\| > 0$. We now want to show that $L^{-1}: \mathcal{R}(L) \to X$ is continuous,
or equivalently (Theorem 5.6.4), bounded. If $y \in \mathcal{R}(L)$, then there is an $x \in X$ such
that $y = Lx$ and $x = L^{-1}y$. From (5.7.1), we get

$$\|L^{-1}y\| = \|x\| \le \frac{1}{m}\|Lx\| = \frac{1}{m}\|y\| \qquad (5.7.3)$$

for all y in $\mathcal{R}(L)$. Hence, $L^{-1}$ is bounded.

Now assume that $L^{-1}: \mathcal{R}(L) \to X$ exists and is continuous. Then there is an
$m > 0$ such that (5.7.2) holds. By reversing the reasoning of (5.7.3), one easily
verifies that (5.7.1) holds. ∎
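
The following Python sketch illustrates Theorem 5.7.1 in the simplest finite-dimensional setting, $X = Y = R^2$ with the Euclidean norm. The matrix L, the helper names, and the random test vectors are illustrative assumptions and do not come from the text. For a real 2 x 2 matrix the best constant m in (5.7.1) is the smallest singular value, which is computed here in closed form from the trace and determinant of $L^T L$.

```python
import math
import random

def smallest_singular_value(a, b, c, d):
    """Smallest singular value of the real 2x2 matrix [[a, b], [c, d]]."""
    trace = a * a + b * b + c * c + d * d      # trace of L^T L
    det = a * d - b * c                        # det(L); det(L^T L) = det^2
    disc = math.sqrt(max(trace * trace - 4.0 * det * det, 0.0))
    return math.sqrt(max((trace - disc) / 2.0, 0.0))

def apply(mat, x):
    (a, b), (c, d) = mat
    return (a * x[0] + b * x[1], c * x[0] + d * x[1])

def inverse(mat):
    (a, b), (c, d) = mat
    det = a * d - b * c
    return ((d / det, -b / det), (-c / det, a / det))

def norm(x):
    return math.hypot(x[0], x[1])

# Hypothetical example operator; any invertible 2x2 matrix would do.
L = ((2.0, 1.0), (0.0, 3.0))
m = smallest_singular_value(2.0, 1.0, 0.0, 3.0)
L_inv = inverse(L)

random.seed(0)
samples = [(random.uniform(-1, 1), random.uniform(-1, 1)) for _ in range(1000)]

# (5.7.1): m ||x|| <= ||Lx||, and (5.7.2): ||L^{-1} y|| <= (1/m) ||y||.
lower_ok = all(m * norm(x) <= norm(apply(L, x)) + 1e-12 for x in samples)
inverse_ok = all(norm(apply(L_inv, y)) <= norm(y) / m + 1e-12 for y in samples)
```

Both flags come out true for this matrix, which is consistent with the theorem: the same constant m that bounds L below bounds $L^{-1}$ above by 1/m.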


EXAMPLE 1. Let T be a continuous linear mapping of a Banach space B into
itself, and consider the linear mapping $\lambda I - T$, where $\lambda$ is a scalar. It follows from
the triangle inequality that

$$\|(\lambda I - T)x\| \ge |\lambda|\,\|x\| - \|Tx\|$$

for all $x \in B$.
Since T is bounded, there is an $M > 0$ such that $\|Tx\| \le M\|x\|$, so for $|\lambda| > M$
one has

$$\|(\lambda I - T)x\| \ge (|\lambda| - M)\|x\|.$$

Hence, by Theorem 5.7.1, the mapping $\lambda I - T$ has a continuous inverse defined on
its range for sufficiently large $|\lambda|$, in particular, for $|\lambda| > M$. Moreover, the mapping

$$K_y(x) = \frac{1}{\lambda}y + \frac{1}{\lambda}Tx,$$

where y is a point in B, is a contraction mapping for sufficiently large $|\lambda|$. (Why?)
So from the Contraction Mapping Theorem (Theorem 3.15.2) we know that $K_y$ has
a unique fixed point x. Since y is arbitrary, we have that given any $y \in B$ there
exists a unique $x \in B$ such that

$$(\lambda I - T)x = y.$$

In other words, the range of $(\lambda I - T)$ is B for sufficiently large $|\lambda|$. ∎
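
The fixed-point argument of Example 1 can be tried numerically. The sketch below is only an illustration under stated assumptions: B is taken to be $R^2$ with the sup-norm, T is a hypothetical 2 x 2 matrix whose induced norm is its largest absolute row sum (here 0.7), and the value $\lambda = 2$ is chosen so that $|\lambda| > M$. The function name `solve_by_contraction` is invented for this example.

```python
def mat_vec(T, x):
    return [sum(T[i][j] * x[j] for j in range(len(x))) for i in range(len(T))]

def solve_by_contraction(T, lam, y, iterations=200):
    """Iterate x <- (y + T x) / lam; a fixed point satisfies (lam*I - T) x = y."""
    x = [0.0] * len(y)
    for _ in range(iterations):
        Tx = mat_vec(T, x)
        x = [(y[i] + Tx[i]) / lam for i in range(len(y))]
    return x

# Hypothetical data: max absolute row sum of T is 0.7, so M = 0.7 works in the sup-norm.
T = [[0.5, 0.2], [0.1, 0.4]]
lam = 2.0
y = [1.0, -1.0]

x = solve_by_contraction(T, lam, y)
Tx = mat_vec(T, x)
# Residual of the equation (lam*I - T) x = y in the sup-norm.
residual = max(abs(lam * x[i] - Tx[i] - y[i]) for i in range(2))
```

With these values each iteration shrinks the error by a factor of at most 0.35, so the residual is at rounding level long before the 200 iterations are used up.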


EXAMPLE 2. Let $T(i\omega)$ be a continuous complex-valued function defined on
the imaginary axis of the complex plane. It may be viewed as the transfer function of
some physical system. We define an operator $y = Tx$ formally by

$$y(i\omega) = T(i\omega)x(i\omega), \quad \omega \in (-\infty,\infty).$$

More precisely, we choose the domain of T to be

$$\mathcal{D}(T) = \{x \in L_2(-i\infty,i\infty): Tx \in L_2(-i\infty,i\infty)\}.$$

It follows that $\mathcal{R}(T) \subset L_2(-i\infty,i\infty)$. In this example we investigate some of the
properties of the operator T.
If $|T(i\omega)|$ is bounded above, that is, $|T(i\omega)| \le B$ for all $\omega$, then

$$\|Tx\|^2 = \frac{1}{2\pi}\int_{-\infty}^{\infty} |T(i\omega)|^2 |x(i\omega)|^2\,d\omega \le \frac{B^2}{2\pi}\int_{-\infty}^{\infty} |x(i\omega)|^2\,d\omega = B^2\|x\|^2.$$

We see, then, that if $|T(i\omega)|$ is bounded above, then the operator T is bounded;
moreover, we see that $\mathcal{D}(T) = L_2(-i\infty,i\infty)$.
If $|T(i\omega)| \ge b > 0$ for all $\omega$, then

$$\|Tx\|^2 = \frac{1}{2\pi}\int_{-\infty}^{\infty} |T(i\omega)|^2 |x(i\omega)|^2\,d\omega \ge b^2\|x\|^2,$$

in other words, T is bounded below. In this case it follows from Theorem 5.7.1 that
$T^{-1}$ exists, is continuous and satisfies $\|T^{-1}y\| \le b^{-1}\|y\|$ for all $y \in \mathcal{R}(T)$. In fact
$T^{-1}$ is given by $x = T^{-1}y$, where

$$x(i\omega) = \frac{1}{T(i\omega)}\,y(i\omega).$$

If $|T(i\omega)|$ satisfies $0 < b \le |T(i\omega)| \le B < \infty$ for all $\omega$, then T and $T^{-1}$ are both
linear and continuous with $\mathcal{D}(T) = \mathcal{R}(T) = L_2(-i\infty,i\infty)$.

What happens to T if $|T(i\omega)|$ vanishes on some interval $\omega_1 \le \omega \le \omega_2$,
where $\omega_1 < \omega_2$? If we set

$$x(i\omega) = \begin{cases} 1, & \omega_1 \le \omega \le \omega_2, \\ 0, & \text{otherwise}, \end{cases}$$

that is, $x = \chi_{[\omega_1,\omega_2]}$, where $\chi_{[\omega_1,\omega_2]}$ denotes the characteristic function of $[\omega_1,\omega_2]$,
then $Tx = 0$. Hence $x \in \mathcal{N}(T)$ and T is not one-to-one. More generally if $|T(i\omega)| = 0$
on a set A with Lebesgue measure $0 < m(A) < \infty$, then $T\chi_A = 0$ where $\chi_A$ is the
characteristic function of A. Hence, $\chi_A \in \mathcal{N}(T)$. Since $\|\chi_A\|^2 \ne 0$ we see that
$\mathcal{N}(T)$ is nontrivial.

The behavior of T, where $|T(i\omega)|$ vanishes only on a set of measure zero, is
discussed in the exercises. ∎
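
A discretized version of Example 2 can be checked in a few lines of Python. Everything concrete below is an assumption made for illustration: the transfer function $T(i\omega) = (1 + i\omega)/(2 + i\omega)$ (for which $1/2 \le |T(i\omega)| < 1$, so one may take b = 1/2 and B = 1), the test signal x, and the frequency grid. The $L_2$ norm is replaced by a Riemann sum, so this is a sketch of the inequalities, not a proof.

```python
import math

def T(omega):
    """Hypothetical transfer function T(i*omega) = (1 + i*omega) / (2 + i*omega)."""
    s = 1j * omega
    return (1 + s) / (2 + s)

def l2_norm(values, d_omega):
    """Riemann-sum approximation of ((1/2pi) * integral of |x|^2 d omega)^(1/2)."""
    return math.sqrt(sum(abs(v) ** 2 for v in values) * d_omega / (2 * math.pi))

d_omega = 0.01
grid = [k * d_omega for k in range(-5000, 5001)]

# An arbitrary rapidly decaying test signal on the frequency axis.
x = [math.exp(-w * w) * complex(math.cos(3 * w), math.sin(w)) for w in grid]
Tx = [T(w) * xv for w, xv in zip(grid, x)]

b, B = 0.5, 1.0
norm_x, norm_Tx = l2_norm(x, d_omega), l2_norm(Tx, d_omega)

# The inverse acts by pointwise division by T(i*omega).
x_back = [yv / T(w) for w, yv in zip(grid, Tx)]
recovery_error = max(abs(u - v) for u, v in zip(x, x_back))
```

Here `b * norm_x <= norm_Tx <= B * norm_x` holds, and dividing by $T(i\omega)$ recovers x up to rounding, in line with the formula for $T^{-1}$ given above.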


EXERCISES 


1. This exercise refers to Example 2. Assume that $|T(i\omega)| = 0$ at $\omega = 0$. Let
$x_n(i\omega) = n$ for $|\omega| \le n^{-2}$ and $x_n(i\omega) = 0$ otherwise.

(a) Show that $x_n$ and $Tx_n$ are in $L_2(-i\infty,i\infty)$.

(b) Compute $\|x_n\|$.

(c) Show that $\|Tx_n\| \to 0$ as $n \to \infty$, thereby showing that T is not bounded
below.


2. (Continuation of Exercise 1.) Let $A = \{\omega: T(i\omega) = 0\}$. Assume that A is finite.

(a) Show that $\mathcal{R}(T) \ne L_2(-i\infty,i\infty)$.

(b) Show that $T^{-1}: \mathcal{R}(T) \to \mathcal{D}(T)$ is well-defined but not continuous.

(c) What happens if A is countable?

(d) What happens if A is a general set of measure zero?

(e) Assume that A is empty but that $|T(i\omega)|$ is not bounded below. Discuss
the behavior of T and $T^{-1}$.


3. What happens in Example 2 if we do not assume $T(i\omega)$ to be continuous?

4. Consider the space $X = C^1[0,T]$ with two norms:

$$\|x\|_a = \|x\|_\infty = \text{sup-norm},$$
$$\|x\|_b = \|x\|_\infty + \|x'\|_\infty,$$

where $x' = dx/dt$.

(a) Show that the identity mapping $I: (X,\|\cdot\|_a) \to (X,\|\cdot\|_b)$ is bounded below,
but not continuous.

(b) Use this fact to compare the topologies on these spaces, referring to
Section 3.9.


5. In Exercise 2, Section 7.7 we will show that if $f \in C^2[a,b]$, then

$$\int_a^b |f'(t)|^2\,dt \le 54\left[\frac{1}{(b-a)^2}\int_a^b |f(t)|^2\,dt + (b-a)^2\int_a^b |f''(t)|^2\,dt\right].$$

Furthermore we note that if

$$f \in C^1[a,b] \text{ with } f(a) = 0,$$

then

$$\int_a^b |f(t)|^2\,dt \le (b-a)^2 \int_a^b |f'(t)|^2\,dt.$$


5.8. OPERATOR TOPOLOGIES 247 


Consider the operator $D^2: u \to u''$ on the domain

$$\mathcal{D}(D^2) = \{u \in C^2[0,1]: u(0) = u(1) = 0\}.$$

(a) Show that the range of $D^2$ lies in $C[0,1]$.
(b) Show that the equation $u'' = f$ has a unique solution u in $\mathcal{D}(D^2)$ when
$f \in C[0,1]$.
(c) Let

$$\|f\|_{(0)} = \left(\int_0^1 |f(t)|^2\,dt\right)^{1/2}$$

and

$$\|u\|_{(2)} = \left(\|u\|_{(0)}^2 + \|u'\|_{(0)}^2 + \|u''\|_{(0)}^2\right)^{1/2}$$

be the norms on $C[0,1]$ and $\mathcal{D}(D^2)$, respectively. Show that there is a
real constant K such that

$$\|u\|_{(2)} \le K\|D^2u\|_{(0)},$$

where u is the solution given by part (b).
(d) What can one say about the inverse of $D^2$?


6. (Continuation of Exercise 5.) Consider the operator $D^4: u \to u^{(4)}$ on the domain

$$\mathcal{D}(D^4) = \{u \in C^4[0,1]: u, u' \text{ vanish at } 0, 1\}.$$

(a) Discuss the equation $u^{(4)} = f$ where $f \in C[0,1]$, $C[0,1]$ has the norm $\|f\|_{(0)}$,
and $\mathcal{D}(D^4)$ has the norm

$$\|u\|_{(4)} = \left(\sum_{i=0}^{4} \|u^{(i)}\|_{(0)}^2\right)^{1/2}.$$

(b) What can one say about the inverse of $D^4$?


8. OPERATOR TOPOLOGIES


The space Blt[X,Y] of all bounded linear transformations of X into Y, where
X and Y are normed linear spaces, is itself a linear space. (See Exercise 5, Section 6.)
We shall show here that it is also a normed linear space. However, before doing this
let us consider an example.


EXAMPLE 1. Consider the (complex) normed linear space $L_2(I)$ with the usual
norm $\|\cdot\|$. Let $\{\phi_1,\phi_2,\ldots\}$ be a sequence in $L_2(I)$ with the property that $\|\phi_n\| = 1$
and $\int_I \phi_n(t)\overline{\phi_m(t)}\,dt = 0$ when $n \ne m$. Let $\{\alpha_1,\alpha_2,\ldots\}$ be a sequence of complex
numbers with $\sum_{n=1}^{\infty} |\alpha_n|^2 < \infty$. Now define

$$k_N(t,s) = \sum_{n=1}^{N} \alpha_n \phi_n(t)\overline{\phi_n(s)}, \quad N = 1, 2, \ldots.$$

The function $k_N(t,s)$, $N = 1, 2, \ldots$, is a point in the Banach space $L_2(I \times I)$. Since,
for $N > M$,

$$\|k_N - k_M\|^2 = \iint_{I \times I} |k_N(t,s) - k_M(t,s)|^2\,dt\,ds = \iint_{I \times I} \left|\sum_{n=M+1}^{N} \alpha_n \phi_n(t)\overline{\phi_n(s)}\right|^2 dt\,ds = \sum_{n=M+1}^{N} |\alpha_n|^2,$$

it follows that $\{k_N\}$ is a Cauchy sequence in the complete space $L_2(I \times I)$. Hence,
the sequence $\{k_N\}$ is convergent in $L_2(I \times I)$. Let

$$k(t,s) = \sum_{n=1}^{\infty} \alpha_n \phi_n(t)\overline{\phi_n(s)}.$$

Let $K_N$ and K be the integral operators

$$(K_N x)(t) = \int_I k_N(t,s)x(s)\,ds, \quad t \in I,$$

$$(Kx)(t) = \int_I k(t,s)x(s)\,ds, \quad t \in I.$$

It was shown in Example 3, Section 6 that both $K_N$ and K are bounded linear
transformations of $L_2(I)$ into itself and

$$\|Kx\| \le \|k\| \cdot \|x\|, \qquad \|K_N x\| \le \|k_N\| \cdot \|x\|.$$

Since $K - K_N$ is an integral operator with kernel (or "unit impulse" response)
$k(t,s) - k_N(t,s)$, one also has

$$\|(K - K_N)x\| \le \|k - k_N\| \cdot \|x\| \qquad (5.8.1)$$

and

$$\lim_{N \to \infty} \|k - k_N\| = 0.$$

In other words, the sequence of operators $\{K_N\}$ converges to K in some sense. If we
restrict x to satisfy $\|x\| \le 1$, we see from (5.8.1) that $\|(K - K_N)x\|$ converges to zero
uniformly. Let us now make this precise. ∎


5.8.1 DEFINITION. Let X and Y be normed linear spaces and let T be a
bounded linear transformation of X into Y. We define the norm of T to be

$$\|T\| = \inf\{M: \|Tx\| \le M\|x\| \text{ for all } x \in X\}. \qquad (5.8.2)$$

It follows from the definition of boundedness that $\|T\|$ is finite. Furthermore,
one obviously has

$$\|Tx\| \le \|T\| \cdot \|x\|, \quad x \in X. \qquad (5.8.3)$$

We want to show that (5.8.2) defines a norm in the precise sense of Definition
5.2.1. However, before doing this it is convenient to note some alternate
formulations of $\|T\|$.


5.8.2 LEMMA. Let X and Y be normed linear spaces and let $T \in$ Blt[X,Y].
Then the norm $\|T\|$ agrees with all of the following:

$$\|T\| = \sup\{\|Tx\|: \|x\| \le 1\}, \qquad (5.8.4a)$$

$$\|T\| = \sup\{\|Tx\|: \|x\| = 1\}, \qquad (5.8.4b)$$

$$\|T\| = \sup\left\{\frac{\|Tx\|}{\|x\|}: x \ne 0\right\}. \qquad (5.8.4c)$$

We leave the proof of this as an exercise. It should be noted that (5.8.4b) and
(5.8.4c) are valid only when the space X is nontrivial.

Before we show that $\|T\|$ is a norm, let us give a physical interpretation of this
number. Picture T as representing a physical system. Then the ratio $\|Tx\|/\|x\|$ can
be viewed as the "gain" or "amplification" of T at the point x. Roughly speaking,
we see by (5.8.4c) that $\|T\|$ is the "maximum" gain, or better, the least upper
bound of the gain.
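
The "gain" interpretation can be made concrete with a small Python sketch. It assumes $X = Y = R^2$ with the Euclidean norm and a hypothetical 2 x 2 matrix T; the function names are invented for this illustration. The supremum in (5.8.4b) is approximated by sampling unit vectors, and it is compared with the largest singular value of T, which for the Euclidean norm is the exact value of $\|T\|$.

```python
import math

def gain(mat, x):
    """Ratio ||Tx|| / ||x|| for a real 2x2 matrix and a nonzero vector x."""
    (a, b), (c, d) = mat
    y = (a * x[0] + b * x[1], c * x[0] + d * x[1])
    return math.hypot(*y) / math.hypot(*x)

def largest_singular_value(mat):
    """Square root of the largest eigenvalue of T^T T (closed form for 2x2)."""
    (a, b), (c, d) = mat
    trace = a * a + b * b + c * c + d * d
    det = a * d - b * c
    disc = math.sqrt(max(trace * trace - 4.0 * det * det, 0.0))
    return math.sqrt((trace + disc) / 2.0)

T = ((1.0, 2.0), (0.0, 1.0))   # hypothetical example operator

# Sample the unit circle in steps of 0.1 degree, as in (5.8.4b).
angles = [2 * math.pi * k / 3600 for k in range(3600)]
unit_vectors = [(math.cos(t), math.sin(t)) for t in angles]
sampled_sup = max(gain(T, x) for x in unit_vectors)
exact_norm = largest_singular_value(T)

# The gain depends only on the direction of x, which is why (5.8.4b) and (5.8.4c) agree.
u = unit_vectors[100]
scaled_gain = gain(T, (5.0 * u[0], 5.0 * u[1]))
```

The sampled supremum never exceeds the exact norm and approaches it as the sampling is refined, which is the sense in which $\|T\|$ is the least upper bound of the gain.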


5.8.3 THEOREM. Let X and Y be normed linear spaces. Then Equation (5.8.2)
defines a norm on Blt[X,Y].


Proof: Referring to the definition of a norm (Definition 5.2.1) we see that
properties (N1) and (N4) obviously hold. We can prove (N3) by using Equation
(5.8.4):

$$\|\alpha T\| = \sup\{\|\alpha Tx\|: \|x\| \le 1\} = \sup\{|\alpha|\,\|Tx\|: \|x\| \le 1\} = |\alpha| \sup\{\|Tx\|: \|x\| \le 1\} = |\alpha|\,\|T\|.$$

Let us now prove (N2). Since we know that $\|Tx\| \le \|T\| \cdot \|x\|$ and $\|Sx\| \le \|S\| \cdot \|x\|$
for $T, S \in$ Blt[X,Y] one has

$$\|(T + S)x\| = \|Tx + Sx\| \le \|Tx\| + \|Sx\| \le (\|T\| + \|S\|)\|x\|.$$

It follows from (5.8.2) that $\|T + S\| \le \|T\| + \|S\|$. ∎


We have thus shown that Blt[X,Y] is a normed linear space. It is important to
note that the norm of T does depend on X and Y and the norms on these spaces.
Furthermore, it should also be mentioned that the norm is generally not easy to
compute. Oftentimes one has to be satisfied with only a crude estimate.

The next theorem shows that operator norms always have an additional
property that norms in general do not have.


5.8.4 THEOREM. Let $T \in$ Blt[X,Y] and $S \in$ Blt[Y,Z] where X, Y and Z are
normed linear spaces. Then the composition ST is in Blt[X,Z] and

$$\|ST\| \le \|S\|\,\|T\|. \qquad (5.8.5)$$

Proof: ST is obviously a linear transformation of X into Z. Since

$$\|STx\| \le \|S\| \cdot \|Tx\| \le \|S\|\,\|T\|\,\|x\|,$$

we see that ST is bounded and that (5.8.5) holds. ∎
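
Inequality (5.8.5) is easy to test numerically in a case where the operator norm has a closed form. The sketch below assumes $R^n$ with the sup-norm, for which the induced operator norm of a matrix is its largest absolute row sum (a standard fact, not derived in the text). The random matrices and the nilpotent example are hypothetical test data.

```python
import random

def sup_operator_norm(A):
    """Operator norm induced by the sup-norm on R^n: the largest absolute row sum."""
    return max(sum(abs(v) for v in row) for row in A)

def compose(S, T):
    """Matrix of the composition ST."""
    n, k, m = len(S), len(T), len(T[0])
    return [[sum(S[i][p] * T[p][j] for p in range(k)) for j in range(m)] for i in range(n)]

random.seed(1)

def random_matrix(n):
    return [[random.uniform(-1, 1) for _ in range(n)] for _ in range(n)]

pairs = [(random_matrix(4), random_matrix(4)) for _ in range(200)]
submultiplicative = all(
    sup_operator_norm(compose(S, T)) <= sup_operator_norm(S) * sup_operator_norm(T) + 1e-12
    for S, T in pairs
)

# The inequality can be strict: this matrix has norm 1 but its square is zero.
nilpotent = [[0.0, 1.0], [0.0, 0.0]]
norm_of_square = sup_operator_norm(compose(nilpotent, nilpotent))
```

The nilpotent example shows that equality in (5.8.5) should not be expected in general.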


It is possible to define other norms on Blt[X,Y]. One can sometimes choose
these norms so that both (5.8.3) and (5.8.5) are satisfied. (See Exercise 3.) In this
book we shall always use (5.8.2), or one of its equivalent formulations, as the
definition of the norm $\|T\|$. This is referred to as the usual norm on Blt[X,Y].

Sequential convergence in Blt[X,Y] with respect to the given norm is referred
to by a special term in the literature.


5.8.5 DEFINITION. Let X and Y be normed linear spaces. A sequence $\{T_n\}$ in
Blt[X,Y] is said to converge uniformly to T in Blt[X,Y] if $\lim_{n \to \infty} \|T - T_n\| = 0$.
This is also referred to as convergence in the uniform topology or convergence in
the operator norm topology.


The "uniformity" means that the convergence of $\{T_n\}$ to T is uniform over the
unit ball $\{x \in X: \|x\| \le 1\}$. Needless to say, Example 1 exhibits this kind of
convergence.

One may ask whether Blt[X,Y] is a Banach space. The answer, which we state
here for reference, is proved in the exercises. (See Exercises 4 and 16.)


5.8.6 THEOREM. Let X and Y be normed linear spaces. Then Blt[X,Y], with
the usual norm, is a Banach space (that is, complete) if Y is a Banach space.


Referring to Theorem 3.14.3, the next result should not be too surprising.


5.8.7 THEOREM. Let D be a dense linear subspace of a normed linear space X,
and let $L: D \to Y$ be a bounded linear mapping of D into a Banach space Y. Then
there is a unique bounded linear extension $L_e$ of L defined everywhere on X; moreover,
$\|L_e\| = \|L\|$.


Proof: The first task is to define $L_e$. Let x be any point in X, and let $\{x_n\}$ be a
sequence in D converging to x. We know one exists because D is dense in X. Then
$\|Lx_n - Lx_m\| = \|L(x_n - x_m)\| \le \|L\|\,\|x_n - x_m\|$, so $\{Lx_n\}$ is a Cauchy sequence in
Y. Since Y is complete, let $y \in Y$ be the limit of $\{Lx_n\}$. It is not difficult to show that
$y = L_e x$ defines a continuous linear mapping of X into Y such that $L_e x = Lx$ for
all $x \in D$. We leave the remainder of the proof as an exercise. ∎


There is another form of convergence of operators which we will need later.
Before defining it though let us consider an example.


EXAMPLE 2. Let $X = Y = L_2(-i\infty,i\infty)$ with the usual norm. If $x \in X$, we
define $y = P_n x$, $n = 1, 2, 3, \ldots$ by

$$y(i\omega) = \begin{cases} x(i\omega), & \text{for } |i\omega| \le n, \\ 0, & \text{otherwise}; \end{cases}$$

then for any $x \in X$, we have

$$\|Ix - P_n x\|^2 = \frac{1}{2\pi}\int_{|i\omega| > n} |x(i\omega)|^2\,d\omega \to 0, \quad \text{as } n \to \infty, \qquad (5.8.6)$$

where I is the identity operator. So we see that $P_n x \to Ix$ for all $x \in X$. In some sense,
then, the sequence $\{P_n\}$ converges to I. Let us now compute the operator norm
$\|I - P_n\|$. Let $x_n$ be any point in X such that $\|x_n\| = 1$ and $x_n(i\omega) = 0$ for $|i\omega| \le n$.
Then $P_n x_n = 0$, so $(I - P_n)x_n = x_n$. Since $\|(I - P_n)x_n\| = \|x_n\| = 1$, we have $\|I - P_n\| \ge 1$
for all n. (One can actually show that $\|I - P_n\| = 1$.) Clearly, it is impossible for
$\{P_n\}$ to converge uniformly to I.

We see, then, that even though $\{P_n\}$ converges to I in the sense of (5.8.6), this
sequence does not converge to I in the norm on Blt[X,X]. On the other hand,
convergence in the sense of (5.8.6) is important, so we give it a name. ∎
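
The same phenomenon can be seen in a discrete analogue. The sketch below replaces $L_2(-i\infty,i\infty)$ by square-summable sequences, truncated to a finite length K so that they fit in memory, and replaces $P_n$ by the operator that keeps the first n entries. This substitution, the test vector x, and all names are assumptions made for illustration; the example in the text itself concerns functions on the imaginary axis.

```python
import math

def truncate(x, n):
    """P_n x: keep the first n entries of the sequence x and zero out the rest."""
    return [v if k < n else 0.0 for k, v in enumerate(x)]

def norm(x):
    return math.sqrt(sum(v * v for v in x))

def difference(x, y):
    return [a - b for a, b in zip(x, y)]

K = 2000                                   # finite stand-in for an infinite sequence
x = [1.0 / (k + 1) for k in range(K)]      # a fixed square-summable vector

# For the fixed x, ||x - P_n x|| shrinks as n grows (the analogue of (5.8.6)).
tail_norms = [norm(difference(x, truncate(x, n))) for n in (10, 100, 1000)]

def witness(n):
    """Unit vector supported on entry n + 1, which P_n sends to zero."""
    return [1.0 if k == n else 0.0 for k in range(K)]

# For each n there is a unit vector on which I - P_n has gain exactly 1.
gains = [norm(difference(witness(n), truncate(witness(n), n))) for n in (10, 100, 1000)]
```

The tail norms decrease toward zero for the fixed vector, while the gains stay equal to 1 for every n, mirroring the contrast between (5.8.6) and $\|I - P_n\| \ge 1$.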


5.8.8 DEFINITION. Let X and Y be normed linear spaces. We shall say that a
sequence $\{T_n\}$ in Blt[X,Y] converges strongly to $T \in$ Blt[X,Y] if

$$\lim_{n \to \infty} \|(T_n - T)x\| = 0$$

for each x in X.
In other words, for each x in X the sequence $\{T_n x\}$ in Y converges (in Y) to
Tx. This is obviously a direct generalization of the familiar concept of pointwise
convergence of functions. The notation used to signify this type of convergence is

$$\lim T_n =_s T, \qquad (5.8.7)$$

the "s" standing for "strong convergence."
In Example 2 we showed that $\lim_{n \to \infty} P_n =_s I$, but that $\{P_n\}$ did not converge
to I in the operator norm topology. We see, then, that a sequence may converge
strongly but not uniformly.
The next lemma shows that the converse is not possible, that is, uniform
convergence always implies strong convergence. (See Figure 5.8.1.)


5.8.9 LEMMA. If the sequence $\{T_n\}$ in Blt[X,Y] converges uniformly to T in
Blt[X,Y], then it converges strongly to T.


Proof: Since $\|(T - T_n)x\| \le \|T - T_n\| \cdot \|x\|$, we see that $\lim \|T - T_n\| = 0$
implies that $\lim \|(T - T_n)x\| = 0$ for each x in X. ∎


Needless to say, when we turn to infinite series of linear operators, we also have
uniform and strong convergence. If the sequence of partial sums is strongly
convergent, the series is strongly convergent. Similarly, uniform convergence of a series
means uniform convergence of the sequence of partial sums. If a series $\sum_{n=1}^{\infty} T_n$
converges strongly to T we shall write this as $\sum_{n=1}^{\infty} T_n =_s T$. Uniform convergence of
a series shall be denoted in the usual fashion, namely $\sum_{n=1}^{\infty} T_n = T$.

[Figure 5.8.1. Nested sets: the uniformly convergent sequences lie inside the strongly
convergent sequences, which lie inside the set of all sequences in Blt[X,Y].]

We end this section with a few more examples.


EXAMPLE 3. Let $H = C^n$, the finite-dimensional space made up of all ordered
n-tuples of complex numbers $x = \{\xi_1,\xi_2,\ldots,\xi_n\}$ with the usual inner product

$$(x,y) = \xi_1\bar{\nu}_1 + \xi_2\bar{\nu}_2 + \cdots + \xi_n\bar{\nu}_n,$$

where $y = \{\nu_1,\nu_2,\ldots,\nu_n\}$ and $\bar{\nu}_i$ denotes the complex conjugate of $\nu_i$. The norm is
given by $\|x\| = \sqrt{(x,x)}$.

Let L be the bounded linear transformation of H into itself represented by the
matrix equation

$$y = Lx,$$

where we use the symbol L to denote an n x n matrix as well as the linear
transformation itself. Then

$$\|Lx\|^2 = x^* L^* L x,$$

where $L^*$ denotes the complex conjugate of the transpose of L (and similarly for $x^*$).
Since $L^*L$ is a positive (semidefinite) Hermitian matrix, the maximum of $\|Lx\|^2$ over
the unit ball exists and is equal to $\lambda_1^2$, where $\lambda_1^2 \ge \lambda_2^2 \ge \cdots \ge \lambda_n^2$ are the eigenvalues
of the matrix $L^*L$. (The validity of this statement is shown in Chapter 6.) It follows that
$\|L\| = |\lambda_1|$.

It should be mentioned that some authors define the norm of an n x n
matrix $L = (l_{ij})$ by

$$|||L||| = \left(\sum_{i,j} |l_{ij}|^2\right)^{1/2}.$$

It can be shown that this norm can be expressed in terms of the eigenvalues of
$L^*L$ as

$$|||L||| = \{\lambda_1^2 + \lambda_2^2 + \cdots + \lambda_n^2\}^{1/2}.$$

Obviously, $|||L|||$ is rarely equal to $\|L\|$. ∎
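
The two matrix norms of Example 3 can be compared directly for real 2 x 2 matrices, where the eigenvalues of $L^T L$ have a closed form. The matrices below and the function names are illustrative assumptions. The sketch checks that the entrywise norm equals the square root of the sum of the eigenvalues of $L^T L$, and that it agrees with the operator norm only in a special case (here, a rank-one matrix).

```python
import math

def gram_eigenvalues(mat):
    """Eigenvalues (largest first) of L^T L for a real 2x2 matrix L."""
    (a, b), (c, d) = mat
    trace = a * a + b * b + c * c + d * d
    det = a * d - b * c
    disc = math.sqrt(max(trace * trace - 4.0 * det * det, 0.0))
    return (trace + disc) / 2.0, (trace - disc) / 2.0

def operator_norm(mat):
    """||L|| = |lambda_1|, the square root of the largest eigenvalue of L^T L."""
    return math.sqrt(gram_eigenvalues(mat)[0])

def entrywise_norm(mat):
    """|||L|||: square root of the sum of the squared entries."""
    return math.sqrt(sum(v * v for row in mat for v in row))

L = ((3.0, 0.0), (4.0, 5.0))          # hypothetical full-rank example
big, small = gram_eigenvalues(L)

rank_one = ((1.0, 2.0), (2.0, 4.0))   # second eigenvalue of L^T L is zero
```

For the first matrix the eigenvalues of $L^T L$ are 45 and 5, so the operator norm is $\sqrt{45}$ while the entrywise norm is $\sqrt{50}$; for the rank-one matrix both norms equal 5.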


EXAMPLE 4. Suppose that we wish to approximate a pure time delay by a
low pass filter whose transfer function is a ratio of polynomials, that is, a lumped
parameter system. In particular, let $B = L_2(-\infty,\infty)$ and let $S_\tau$ be the delay by $\tau$
seconds operator defined by

$$(S_\tau x)(t) = x(t - \tau), \quad \text{for } t \in (-\infty,\infty).$$

If $\mathcal{F}$ denotes the Fourier transform, then it is well-known that

$$S_\tau = \mathcal{F}^{-1}\Lambda\mathcal{F},$$

where $\Lambda$ denotes the mapping of $L_2(-i\infty,i\infty)$ into itself defined by

$$(\Lambda X)(i\omega) = e^{-i\omega\tau}X(i\omega),$$

where $X = \mathcal{F}x$, that is, $\Lambda$ is the operation of multiplication by the (transfer)
function $e^{-i\omega\tau}$.

Now assume that the nth approximation to $S_\tau$ has a transfer function of the
form

$$\Lambda_n(i\omega) = k_n \frac{(i\omega + z_1)\cdots(i\omega + z_m)}{(i\omega + p_1)\cdots(i\omega + p_n)},$$

where $n > m$ are integers; $k_n$ is a real number; $z_1,\ldots,z_m$ are complex numbers and
zeros of $\Lambda_n(s)$; $p_1,p_2,\ldots,p_n$ are complex numbers and poles of $\Lambda_n(s)$. Assume
further that the poles are in the left-hand plane and that the poles and zeros with
nonzero imaginary parts occur in complex conjugate pairs, that is, if p is a pole (or
zero), then $\bar{p}$ is a pole (or zero) also. Since $n > m$, it follows that

$$\lim_{\omega \to \pm\infty} |\Lambda_n(i\omega)| = 0,$$

that is, $\Lambda_n(i\omega)$ is a low pass filter. Then because

$$\operatorname{ess\,sup} |e^{-i\omega\tau} - \Lambda_n(i\omega)| \ge 1$$

it follows that

$$\|\Lambda - \Lambda_n\| \ge 1,$$

for $n = 1, 2, 3, \ldots$. Thus, no matter how we select a sequence of approximations
$\{\Lambda_n\}$, it cannot converge uniformly to $\Lambda$. On the other hand, there are many
sequences $\{\Lambda_n\}$ that converge strongly to $\Lambda$. ∎
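
A numerical look at one particular family of low pass filters may help fix ideas. The family $(1 + i\omega\tau/n)^{-n}$ used below is a hypothetical choice made for this sketch, not one given in the text; it has all n poles at $-n/\tau$ and no zeros, so it fits the form assumed above with m = 0. The sketch only samples the transfer functions on a finite frequency grid, so it illustrates the two kinds of behavior without proving anything about convergence of the operators.

```python
import cmath

def delay(omega, tau):
    """Transfer function exp(-i*omega*tau) of the pure delay."""
    return cmath.exp(-1j * omega * tau)

def lowpass(omega, tau, n):
    """Hypothetical n-pole low pass filter (1 + i*omega*tau/n)^(-n)."""
    return (1 + 1j * omega * tau / n) ** (-n)

tau = 1.0
grid = [0.5 * k for k in range(-4000, 4001)]   # omega from -2000 to 2000

# Largest sampled error over the grid: stays near 1 or above because |lowpass| -> 0
# at high frequency while |delay| = 1 everywhere.
sup_errors = {
    n: max(abs(delay(w, tau) - lowpass(w, tau, n)) for w in grid) for n in (1, 4, 16)
}

# Error at one fixed frequency: shrinks as n grows.
pointwise_errors = [abs(delay(1.0, tau) - lowpass(1.0, tau, n)) for n in (1, 4, 16)]
```

The sampled sup errors do not decrease toward zero, in agreement with $\|\Lambda - \Lambda_n\| \ge 1$, while the error at a fixed frequency does decrease, which is the kind of behavior one would look for in a strongly convergent sequence.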


EXAMPLE 5. (NEUMANN SERIES.) Suppose B is a Banach space and L is a
continuous linear mapping of B into itself. Further suppose that we are interested in
the inverse, if one exists, of the transformation $\lambda I - L$, where $\lambda$ is a complex
number. We can carry out division formally as follows (see Example 1, Section 7):

$$\frac{I}{\lambda I - L} = \frac{1}{\lambda}I + \frac{1}{\lambda^2}L + \frac{1}{\lambda^3}L^2 + \cdots + \frac{1}{\lambda^{k+1}}L^k + \cdots.$$

Here the first quotient term $\lambda^{-1}I$ leaves the remainder $\lambda^{-1}L$, the second quotient
term $\lambda^{-2}L$ leaves the remainder $\lambda^{-2}L^2$, and so on.

Of course, this is just formal manipulation, and one has to show that it is meaningful
in some sense or another. One thing we can do is show that the series converges to
the inverse of $(\lambda I - L)$ if $|\lambda| > \|L\|$. In this case the series

$$\sum_{k=0}^{\infty} \frac{1}{\lambda^{k+1}}L^k,$$

which is called the Neumann series, is absolutely convergent since

$$\|\lambda^{-1}L\| \le \|L\|/|\lambda| < 1.$$

But B is complete, so Blt[B,B] is also. It follows from Theorem 5.4.2 that the
above series is convergent in Blt[B,B]. Let T denote the limit. We want to show
that $T = (\lambda I - L)^{-1}$. Let

$$T_N = \sum_{k=0}^{N} \frac{1}{\lambda^{k+1}}L^k, \quad N = 1, 2, 3, \ldots.$$

Then

$$Tx = \lim_{N \to \infty} T_N x \quad \text{or} \quad T =_s \lim_{N \to \infty} T_N.$$

Since $(\lambda I - L)$ is continuous, one has

$$(\lambda I - L)Tx = \lim_{N \to \infty} (\lambda I - L)T_N x = \lim_{N \to \infty} \left(I - \frac{1}{\lambda^{N+1}}L^{N+1}\right)x = x,$$

for all $x \in B$.

We have shown that $(\lambda I - L)T = I$. Similarly, we can show that $T(\lambda I - L) = I$,
so $T = (\lambda I - L)^{-1}$.

We hasten to add that $|\lambda| > \|L\|$ is only a sufficient condition for $(\lambda I - L)^{-1}$ to
exist and be continuous. It is not difficult to find cases where $\|L\| > |\lambda|$ and
$(\lambda I - L)^{-1}$ is in Blt[B,B]. ∎
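
The partial sums $T_N$ can be computed explicitly in finite dimensions. In the sketch below B is taken to be $R^2$ with the sup-norm and L is a hypothetical 2 x 2 matrix whose induced norm (largest absolute row sum) is 0.8, so that $\lambda = 2$ satisfies $|\lambda| > \|L\|$. These choices and the function names are assumptions for illustration. The quantity monitored is the largest entry of $(\lambda I - L)T_N - I$, which by the identity above equals the largest entry of $\lambda^{-(N+1)}L^{N+1}$.

```python
def mat_mul(A, B):
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)] for i in range(n)]

def neumann_partial_sum(L, lam, N):
    """T_N = sum over k = 0..N of L^k / lam^(k+1)."""
    n = len(L)
    total = [[0.0] * n for _ in range(n)]
    power = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]  # L^0 = I
    for k in range(N + 1):
        for i in range(n):
            for j in range(n):
                total[i][j] += power[i][j] / lam ** (k + 1)
        power = mat_mul(power, L)
    return total

def residual(L, lam, N):
    """Largest entry of (lam*I - L) T_N - I in absolute value."""
    n = len(L)
    shifted = [[(lam if i == j else 0.0) - L[i][j] for j in range(n)] for i in range(n)]
    product = mat_mul(shifted, neumann_partial_sum(L, lam, N))
    return max(abs(product[i][j] - (1.0 if i == j else 0.0))
               for i in range(n) for j in range(n))

L = [[0.5, 0.3], [0.2, 0.4]]   # hypothetical operator, max absolute row sum 0.8
lam = 2.0
residuals = [residual(L, lam, N) for N in (1, 5, 40)]
```

The residuals decrease geometrically with N, so $T_N$ approaches $(\lambda I - L)^{-1}$ as the example asserts.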


EXERCISES 


1. Prove Lemma 5.8.2.

2. Consider $R^2$ with the norm $\|\cdot\|_p$ as in Example 1, Section 3. Let T be a matrix
operator

$$T = \begin{pmatrix} a & b \\ c & d \end{pmatrix}$$

mapping $(R^2, \|\cdot\|_p)$ into $(R^2, \|\cdot\|_p)$.
(a) Compute $\|T\|$ when $b = c$ and $p = 2$.
(b) Compute $\|T\|$ in general.

3. Let $M_n$ denote the space of real n x n matrices. For $A = (a_{ij}) \in M_n$ define

$$n(A) = \sum_{i,j} |a_{ij}|.$$

(a) Show here that $\|Ax\|_1 \le n(A)\|x\|_1$, where $x \in R^n$ and $R^n$ has the norm
$\|x\|_1 = \sum_{i=1}^{n} |x_i|$.

(b) Let $A, B \in M_n$. Show that $n(AB) \le n(A)n(B)$.

(c) Let $\|A\|$ be given by Definition 5.8.1 where $X = Y = R^n$ with the norm
$\|x\|_1$. Compare $n(A)$ and $\|A\|$. When are they equal?

4. Prove Theorem 5.8.6. [Hint: Note that $\|(T_n - T_m)x\| \le \|T_n - T_m\|\,\|x\|$. Use
this to define T by $Tx = \lim T_n x$, where $\{T_n\}$ is a Cauchy sequence. Show that
$T \in$ Blt[X,Y] and that $\|T_n - T\| \to 0$ as $n \to \infty$. See Exercise 16, below.]

5. In Example 2 show that for $N \ne M$ one has $\|P_N - P_M\| = 1$. (Thus $\{P_N\}$ is not
a Cauchy sequence in the norm on Blt[X,X].)

6. Let $\{T_n\}$ be a Cauchy sequence in the usual norm on Blt[X,Y]. Assume that
$T \in$ Blt[X,Y] and that $\lim T_n =_s T$. Show that $\|T_n - T\| \to 0$ as $n \to \infty$, in other
words, $\lim T_n = T$.

7. (Uniform Boundedness Principle.) Let $\{T_n\}$ be a sequence in Blt[X,Y] where
X is a Banach space. Assume that $\{\|T_n x\|\}$ is bounded for each x in X, that is,

$$\sup_n \|T_n x\| \le B(x) < \infty,$$

where B depends on x. Now show that $\{\|T_n\|\}$ is bounded. [Hint: Let
$A_m = \{x \in X: \|T_n x\| \le m \text{ for all } n\}$. Show that $A_m$ is closed and $X = \bigcup_{m=1}^{\infty} A_m$.
Now apply Exercise 17, Section 3.13.]

8. Let $\{T_n\}$ be a sequence in Blt[X,Y] where X is a Banach space. Assume that
for each x in X, the limit $Tx = \lim_{n \to \infty} T_n x$ exists. Show that $T \in$ Blt[X,Y].
[Hint: Use Exercise 7.]


9. Show that the Uniform Boundedness Principle (Exercise 7) fails if the domain
X is not a Banach space. [Hint: Let $T_n = T$ be an appropriate unbounded
operator.]

10. Explain why in Exercise 7 one does not have to assume that the space Y is a
Banach space.

11. (Open Mapping Theorem.) The object of this exercise is to prove the following:
Let $T \in$ Blt[X,Y], where X and Y are Banach spaces. If $T(X) = Y$, then T is an
open mapping, that is, T maps open sets onto open sets. Let $B = \{x \in X: \|x\| < 1\}$
be the open unit ball in X, and $nB = \{nx: \|x\| < 1\}$, where $n = 1, 2, \ldots$.

(a) Show that if $T(nB)$ contains a nonempty open sphere, then T is an open
mapping.

(b) Show that if the closure $\overline{T(nB)}$ contains a nonempty open sphere, then T
is an open mapping.

(c) Show that $Y = \bigcup_{n=1}^{\infty} T(nB) = \bigcup_{n=1}^{\infty} \overline{T(nB)}$ and apply Exercise 17,
Section 3.13.

12. (Continuation of Exercise 11.) Show that if $T \in$ Blt[X,Y] where X and Y are
Banach spaces and if T is one-to-one with $T(X) = Y$, then $T^{-1}$ is a bounded
linear transformation.

13. (Closed Graph Theorem.) Let X and Y be Banach spaces and let $T: \mathcal{D}(T) \to Y$
be a linear transformation, where $\mathcal{D}(T)$ is a linear subspace of X. Let Gr(T) be
the graph of T in $X \times Y$. Show that if both $\mathcal{D}(T)$ and Gr(T) are closed, then
T is bounded. [Hint: Apply Exercises 11 and 12 to the mapping $S: (x,Tx) \to x$.]

14. Use the Closed Graph Theorem to show that if an unbounded linear operator
is closed, its domain is not a Banach space.

15. Let $X = Y = L_2(-\infty,\infty)$. Show that the set $\mathcal{C}$ of all causal bounded linear
operators (see Section 2.8) is a closed linear subspace of Blt[X,Y] with the usual
norm. Similarly, show that the set (TI) of all time-invariant bounded linear
operators is a closed linear subspace of Blt[X,Y]. Do the same thing for
$\mathcal{C} \cap (TI)$.

16. Prove the converse of Theorem 5.8.6, that is, show that if Y is not complete and
$X \ne \{0\}$, then Blt[X,Y] is not complete. [Hint: Let $\{y_n\}$ be a Cauchy sequence
in Y that is not convergent. Show that there exists a Cauchy sequence $\{T_n\}$ in
Blt[X,Y] and a point x in X such that $T_n x = y_n$ for all n. Then show that the
sequence $\{T_n\}$ is not uniformly convergent in Blt[X,Y].]

17. Let X and Y be normed linear spaces, and let $L_n$, $n = 1, 2, \ldots$, be continuous
linear mappings of X into Y. Assume that L is a mapping of X into Y.
Further assume that for each $n = 1, 2, \ldots$ there exists an $M_n > 0$ such that

$$\|Lx - L_n x\| \le M_n\|x\|, \quad \text{for all } x \in X.$$

Finally assume that $M_n \to 0$ as $n \to \infty$. Show that it follows that L is a continuous
linear mapping of X into Y. (This exercise shows that if the sequence $\{L_n\}$
converges "uniformly" its limit must be continuous and linear.)


18. (Continuation of Exercise 17.) Assume that

$$\lim_{n \to \infty} \|L_n x - Lx\| = 0, \quad \text{for all } x \in X,$$

that is, $\{L_n\}$ converges strongly to L. Show that L is linear.

19. Let B be the Banach space $L_2(-\infty,\infty)$, and let $L: B \to B$ be defined by
$(Lx)(t) = kx(t + \tau)$, where $\tau > 0$ and $|k| < 1$. Consider the difference equation

$$(I + L)x = y.$$

Show that the solution x for a given y is

$$x(t) = y(t) - ky(t + \tau) + k^2 y(t + 2\tau) - k^3 y(t + 3\tau) + \cdots.$$

(It is interesting to compare this exercise with Example 3, Section 3.15.)

20. Extend Theorem 5.8.7 to the case where $L: D \to Y$ is a bounded linear mapping
but Y is not assumed to be complete.

21. In Example 4, it was stated that there do exist sequences $\{\Lambda_n\}$ that converge
strongly to $\Lambda$. Construct such a sequence.

22. Show that the Neumann series in Example 5 converges uniformly provided
$\|L\| < |\lambda|$.

23. Let $\mathcal{S}$ denote the collection of all operators $S: L_2(-\infty,\infty) \to L_2(-\infty,\infty)$ of
the form

$$S = \alpha_1 S_{\tau_1} + \cdots + \alpha_n S_{\tau_n},$$

where $S_\tau: x(t) \to x(t - \tau)$ is the shift operator. Show that there are bounded
linear time-invariant operators on $L_2(-\infty,\infty)$ that cannot be written as the
uniform limit of operators in $\mathcal{S}$. [Hint: Apply the Fourier transform to $\mathcal{S}$ and
show that $\mathcal{S}$ is mapped into the almost periodic functions in $L_\infty(-i\infty,i\infty)$.
Now use the following two facts: (1) The collection of almost periodic functions
is closed under uniform limits, compare with Besicovitch [1], and (2) there is a
bounded linear time-invariant operator whose Fourier transform is not an
almost periodic function, compare with Bochner [1, p. 144].]


9. EQUIVALENCE OF NORMED LINEAR SPACES


The reader should now be able to guess how one would define the concept of
equivalence between normed linear spaces. Basically one wants a mapping that
preserves both the algebraic and the topological structure. Let us make this precise.


5.9.1 DEFINITION. Two normed linear spaces X and Y are said to be
topologically isomorphic if there exists a continuous linear transformation $\phi$ of X onto Y
such that the inverse $\phi^{-1}$ exists and is continuous. In this case, the mapping $\phi$ is said
to be a topological isomorphism of X onto Y.


A stronger form of equivalence is sometimes employed with normed linear
spaces, and almost always employed with Hilbert spaces.


5.9.2 DEFINITION. Two normed linear spaces X and Y are said to be
isometrically isomorphic if there exists a linear transformation $\phi$ of X onto Y such that

$$\|\phi x\| = \|x\| \qquad (5.9.1)$$

for all x in X. In this case, the mapping $\phi$ is said to be an isometric isomorphism.


Note that if (5.9.1) holds, then $\phi$ is bounded (that is, continuous), and $\phi^{-1}$
exists and is also an isometric isomorphism.

It is important that the reader not read too much into these equivalences.
"Topologically isomorphic" signals the fact that algebraic and topological
structures are essentially the same, but it says nothing about other structures that X and
Y may have. Similarly, "isometrically isomorphic" says nothing about structures
that X and Y may have in addition to their normed linear space structure. For
example, one can consider the space of complex numbers as a real normed linear
space. Then it is simple to show that C and $R^2$, where $R^2$ is a two-dimensional
Euclidean space, are isometrically isomorphic. One isometric isomorphism $\phi$ would
be the mapping which takes the complex number z into the ordered pair (Re z, Im z).
However, C has an operation of multiplication defined on it, whereas $R^2$ does not.
Another example would be the normed linear space $\mathcal{M}^n$ of real-valued n x n
matrices, where $M = \{m_{ij}\}$ denotes a point in $\mathcal{M}^n$ and

$$\|M\| = \left(\sum_{i,j} |m_{ij}|^2\right)^{1/2}.$$

It is easily shown that $\mathcal{M}^n$ and $R^{n^2}$, where $R^{n^2}$ is an $n^2$-dimensional Euclidean space,
are isometrically isomorphic. However, there is an operation of matrix
multiplication defined on $\mathcal{M}^n$ but not on $R^{n^2}$.

It is possible to give a necessary and sufficient condition for two normed linear 
spaces X and Y to be topologically isomorphic. Quite properly this condition is a 
condition on a linear transformation between X and Y. 


5.9.3 THEOREM. Let $(X,\|\cdot\|_X)$ and $(Y,\|\cdot\|_Y)$ be two normed linear spaces. X and
Y are topologically isomorphic if and only if there exists a linear transformation $\phi$
with domain X and range Y and two positive constants m and M such that

$$m\|x\|_X \le \|\phi x\|_Y \le M\|x\|_X, \quad x \in X. \qquad (5.9.2)$$

Proof: If X and Y are topologically isomorphic, then there exists a continuous
linear transformation $\phi$ of X onto Y with a continuous inverse $\phi^{-1}: Y \to X$. This
means that

$$x = \phi^{-1}\phi x \quad \text{and} \quad y = \phi\phi^{-1}y$$

for all x in X and all y in Y, and there exist positive constants m and M such that

$$\|\phi x\|_Y \le M\|x\|_X, \quad x \in X,$$

$$\|\phi^{-1}y\|_X \le \frac{1}{m}\|y\|_Y, \quad y \in Y.$$

Hence (5.9.2) holds.

Conversely, if there exists a linear transformation $\phi$ of X onto Y that satisfies
(5.9.2), then $\phi$ is continuous because it is bounded. Furthermore, Theorem 5.7.1
guarantees that $\phi^{-1}$ exists and is continuous. Thus, X and Y are topologically
isomorphic. ∎


EXAMPLE 1. In Example 2, Section 7 we constructed a mapping T by $y = Tx$
where

$$y(i\omega) = T(i\omega)x(i\omega).$$

We showed that if $T(i\omega)$ satisfies

$$0 < b \le |T(i\omega)| \le B < \infty$$

for all $\omega$, then T was a topological isomorphism of $L_2(-i\infty,i\infty)$ onto itself. ∎


EXAMPLE 2. Let X denote the collection of all real solutions $\phi(t)$, $0 \le t \le 1$, of
the differential equation $x'' - x = 0$. Thus, $\phi(t) = c_1e^t + c_2e^{-t}$ where $c_1$ and $c_2$ are
real constants. Assume that X has the sup-norm $\|\cdot\|_\infty$. We define a mapping
$T: X \to R^2$ by

$$T(c_1e^t + c_2e^{-t}) = (c_1, c_2).$$

It follows from the standard theory of differential equations that T is an
isomorphism of X onto $R^2$. Let us now show that T is a topological mapping where $R^2$ has
the norm $\|(c_1,c_2)\|_1 = |c_1| + |c_2|$. Since

$$\sup |c_1e^t + c_2e^{-t}| \le \sup |c_1|e^t + \sup |c_2|e^{-t} \le e(|c_1| + |c_2|),$$

where the sup is taken over $0 \le t \le 1$, we have

$$e^{-1}\|\phi\|_\infty \le \|T\phi\|_1,$$

so T is bounded below. In order to show that T is bounded above we solve the
equations

$$\phi(0) = c_1 + c_2,$$
$$\phi(1) = c_1e + c_2e^{-1},$$

for $c_1$ and $c_2$. Using the fact that $|\phi(0)| \le \|\phi\|_\infty$ and $|\phi(1)| \le \|\phi\|_\infty$ this leads to the
estimate

$$\|T\phi\|_1 = |c_1| + |c_2| \le \frac{4e}{e - e^{-1}}\|\phi\|_\infty.$$

Hence T is a topological isomorphism.

In this example, T is an isomorphism between two finite-dimensional normed
linear spaces of the same dimension. It turns out that such mappings are always
topological (see Theorem 5.10.5). ∎
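
The two estimates of Example 2 can be spot-checked numerically. In the sketch below the sup-norm is approximated by a maximum over a uniform grid on [0, 1] that includes both endpoints, and the coefficient pairs are random test data; both are assumptions of this illustration. Because the grid contains t = 0 and t = 1, the derivation of the upper estimate (which only uses $\phi(0)$ and $\phi(1)$) applies to the grid maximum as well.

```python
import math
import random

def sup_norm(c1, c2, points=201):
    """Grid approximation of sup |c1*e^t + c2*e^(-t)| over 0 <= t <= 1, endpoints included."""
    return max(
        abs(c1 * math.exp(k / (points - 1)) + c2 * math.exp(-k / (points - 1)))
        for k in range(points)
    )

lower = math.exp(-1)                             # e^{-1} ||phi|| <= ||T phi||_1
upper = 4 * math.e / (math.e - math.exp(-1))     # ||T phi||_1 <= upper * ||phi||

random.seed(2)
coefficients = [(random.uniform(-5, 5), random.uniform(-5, 5)) for _ in range(200)]

bounds_hold = all(
    lower * sup_norm(c1, c2) <= abs(c1) + abs(c2) + 1e-12
    and abs(c1) + abs(c2) <= upper * sup_norm(c1, c2) + 1e-12
    for c1, c2 in coefficients
)
```

Both inequalities hold for every sampled pair, consistent with (5.9.2) for the constants $m = e^{-1}$ and $M = 4e/(e - e^{-1})$.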


260 COMBINED TOPOLOGICAL AND ALGEBRAIC STRUCTURE 
EXAMPLE 3. Let $Y$ denote the complex linear space of all complex-valued
polynomial functions of a real variable, that is, each $y$ in $Y$ can be written as
\[ y(t) = a_0 + a_1 t + \cdots + a_n t^n, \]
where $n = 0, 1, 2, 3, \ldots$, the coefficients $\{a_0, \ldots, a_n\}$ are complex numbers and $t$
is real. Define a norm on $Y$ by
\[ \|y\| = |a_0| + \cdots + |a_n|. \tag{5.9.3} \]
Let $X$ be the linear subspace of $Y$ consisting of all polynomials $x(t)$ that satisfy
$x(0) = 0$, that is, the coefficient $a_0$ vanishes. Define the mapping $L: X \to Y$ by
\[ Lx = L(a_1 t + \cdots + a_n t^n) = a_1 + 2a_2 t + \cdots + n a_n t^{n-1}, \]
that is, $L$ is the differentiation operator.

Let us show that $L$ is not continuous under the norm given by (5.9.3). For
example, define the sequence of unit vectors $\{x_N\}$ by
\[ x_N = \frac{1}{N}\,(t + t^2 + \cdots + t^N), \qquad N = 1, 2, 3, \ldots. \]
Then
\[ \|Lx_N\| = \Big\| \frac{1}{N}\,(1 + 2t + \cdots + N t^{N-1}) \Big\| = \tfrac{1}{2}(N+1) \to \infty \quad \text{as } N \to \infty, \]
so $L$ is unbounded. $L$ is obviously one-to-one. Let us now show that $L$ maps $X$ onto
$Y$. Let $y = b_0 + b_1 t + \cdots + b_n t^n$ be an arbitrary point in $Y$. Since
\[ L\Big(b_0 t + \frac{b_1}{2} t^2 + \cdots + \frac{b_n}{n+1} t^{n+1}\Big) = y, \]
we see that the range of $L$ is all of $Y$. It follows that $L^{-1}: Y \to X$ exists. Since
\[ L^{-1}y = L^{-1}(a_0 + a_1 t + \cdots + a_n t^n) = a_0 t + \frac{a_1}{2} t^2 + \cdots + \frac{a_n}{n+1} t^{n+1}, \]
we have $\|L^{-1}y\| \le \|y\|$, so $L^{-1}$ is continuous. Although $L$ is an (algebraic) isomorphism
of the linear space $X$ onto the linear space $Y$, it is not a topological
isomorphism, since $L$ is not continuous. Nevertheless, $X$ and $Y$ are topologically
isomorphic. We just need to consider another mapping. Indeed, the mapping $\phi$
defined by
\[ \phi(a_1 t + \cdots + a_n t^n) = a_1 + a_2 t + \cdots + a_n t^{n-1} \]
with
\[ \phi^{-1}(b_0 + b_1 t + \cdots + b_n t^n) = b_0 t + b_1 t^2 + \cdots + b_n t^{n+1} \]
is a topological isomorphism of $X$ onto $Y$. As a matter of fact, $\phi$ is an isometric
isomorphism.




Let us show yet another normed linear space that is isometrically isomorphic
to $Y$. Let us consider $l_1$, that is, the normed linear space made up of all infinite
sequences $z = \{z_1, z_2, z_3, \ldots\}$ of complex numbers such that $\sum_{n=1}^{\infty} |z_n| < \infty$ and
with $\|z\| = \sum_{n=1}^{\infty} |z_n|$. Let $W$ be the linear subspace of $l_1$ made up of all sequences
which contain only a finite number of nonzero entries. We remark in passing that
$W$ is dense in $l_1$. We ask the reader to verify that the mapping $V$ of $Y$ onto $W$ defined by
\[ V(a_0 + a_1 t + \cdots + a_n t^n) = \{a_0, a_1, \ldots, a_n, 0, 0, \ldots\} \]
is an isometric isomorphism of $Y$ onto $W$. ∎
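
The unboundedness of differentiation in Example 3 is easy to reproduce with exact arithmetic. The sketch below is an illustration under stated assumptions: a polynomial is stored as a list of coefficients (index $k$ holds the coefficient of $t^k$), the coefficients are restricted to rationals so that Python's `Fraction` can be used, and the function names are hypothetical.

```python
from fractions import Fraction

def norm(coeffs):
    """The norm (5.9.3): the sum of the absolute values of the coefficients."""
    return sum(abs(a) for a in coeffs)

def differentiate(coeffs):
    """Apply L; coeffs[k] is the coefficient of t^k."""
    return [k * a for k, a in enumerate(coeffs)][1:]

def antiderivative(coeffs):
    """Apply L^{-1}: integrate term by term, with zero constant term."""
    return [Fraction(0)] + [Fraction(b) / (k + 1) for k, b in enumerate(coeffs)]

def x_N(N):
    """The unit vector x_N = (1/N)(t + t^2 + ... + t^N)."""
    return [Fraction(0)] + [Fraction(1, N)] * N

for N in (1, 10, 100):
    # ||x_N|| stays equal to 1 while ||L x_N|| = (N + 1)/2 grows without bound.
    print(N, norm(x_N(N)), norm(differentiate(x_N(N))))
```

In the same representation, `antiderivative` never increases the norm, which mirrors the estimate $\|L^{-1}y\| \le \|y\|$.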


In Section 3.9 we studied families of metrics defined on the same underlying
set, that is, $(X,d_1)$ and $(X,d_2)$. There we introduced the concept of equivalent
metrics. Recall that two metrics are equivalent if and only if they generate the same
topology. We shall say that two norms, $\|\cdot\|_a$ and $\|\cdot\|_b$, on a linear space $X$ are
equivalent if the metrics generated by these norms are equivalent. The following
corollary shows that this equivalence can be characterized in a straightforward
manner.


5.9.4 COROLLARY. Let $X$ be a linear space and let $\|\cdot\|_a$ and $\|\cdot\|_b$ be norms
defined on $X$. These norms are equivalent if and only if there exist positive constants
$m$ and $M$ such that
\[ m\|x\|_a \le \|x\|_b \le M\|x\|_a \tag{5.9.4} \]
for all $x$ in $X$.


Proof: Let $I$ be the identity mapping of $(X,\|\cdot\|_a)$ onto $(X,\|\cdot\|_b)$. Obviously,
$I^{-1}$ exists. The mappings $I$ and $I^{-1}$ are both continuous if and only if the topologies
generated by $\|\cdot\|_a$ and $\|\cdot\|_b$ are the same. (Why?) But $I$ and $I^{-1}$ are also both
continuous if and only if they are bounded, that is, if and only if (5.9.4) holds. (See
Theorems 5.6.4 and 5.7.1.) ∎


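EXAMPLE 4. This is a continuation of Example 2. Let $X$ denote the collection
of all solutions $\phi(t)$, $0 \le t \le 1$, of the differential equation $x'' - x = 0$. Let $\|\phi\|_\infty$
denote the sup-norm and
\[ \|\phi\|_2 = \Big( \int_0^1 |\phi(t)|^2 \, dt \Big)^{1/2}. \]
First we note that
\[ \|\phi\|_2^2 = \int_0^1 |\phi(t)|^2 \, dt \le \int_0^1 \|\phi\|_\infty^2 \, dt = \|\phi\|_\infty^2. \]
If $\phi(t) = c_1 e^{t} + c_2 e^{-t}$, then it is easily shown that
\[ \|\phi\|_2^2 = \tfrac{1}{2}(e^2 - 1)c_1^2 + 2c_1c_2 + \tfrac{1}{2}(1 - e^{-2})c_2^2 = f(c_1,c_2) \ge 0. \tag{5.9.5} \]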




By standard analytic geometry we see that the level curves $f(c_1,c_2) = \text{constant}$ in
the $c_1c_2$-plane are ellipses centered at the origin. In particular we see that $f(c_1,c_2) > 0$
when $(c_1,c_2) \ne (0,0)$. [Another way to see the last fact is that if $f(c_1,c_2) = 0$ for
some $(c_1,c_2) \ne (0,0)$, then $e^{t}$ and $e^{-t}$ would be linearly dependent over $0 \le t \le 1$.
This is absurd.] Let
\[ m^2 = \min\{ f(c_1,c_2) : |c_1| + |c_2| = 1 \}. \]
Since $f$ is continuous on a compact set, this minimum exists and is positive. We then
have
\[ \|\phi\|_2^2 \ge m^2 = m^2(|c_1| + |c_2|)^2, \tag{5.9.6} \]
provided $|c_1| + |c_2| = 1$. We now claim that (5.9.6) holds for all $(c_1,c_2)$. It clearly
holds for $(c_1,c_2) = (0,0)$. So if $(c_1,c_2) \ne (0,0)$ and $\alpha = |c_1| + |c_2|$, then
\[ \|\phi\|_2^2 = \int_0^1 |c_1 e^{t} + c_2 e^{-t}|^2 \, dt = \alpha^2 \int_0^1 \Big| \frac{c_1}{\alpha} e^{t} + \frac{c_2}{\alpha} e^{-t} \Big|^2 dt
 = \alpha^2 f\Big(\frac{c_1}{\alpha}, \frac{c_2}{\alpha}\Big) \ge \alpha^2 m^2 = m^2(|c_1| + |c_2|)^2. \]
Finally, it follows from Example 2 that
\[ \|\phi\|_\infty \le e\,(|c_1| + |c_2|) \le \frac{e}{m}\, \|\phi\|_2. \]
Hence $\|\phi\|_2$ and $\|\phi\|_\infty$ are equivalent on $X$. ∎
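
The constant $m$ in Example 4 is defined only as a minimum over a compact set, so it may help to see a rough numerical value. The sketch below samples the four edges of the set $|c_1| + |c_2| = 1$ and evaluates the quadratic form (5.9.5) there. The sampling scheme and all names are assumptions made for illustration; a sampled minimum slightly overestimates the true one.

```python
import math

A = (math.e ** 2 - 1) / 2      # coefficient of c1^2 in (5.9.5)
C = (1 - math.e ** -2) / 2     # coefficient of c2^2 in (5.9.5)

def f(c1, c2):
    """Squared L2 norm of c1*e^t + c2*e^(-t) on [0, 1], taken from (5.9.5)."""
    return A * c1 * c1 + 2 * c1 * c2 + C * c2 * c2

def min_on_unit_sphere(steps=4000):
    """Approximate m^2 = min f over |c1| + |c2| = 1 by sampling the four edges."""
    best = float("inf")
    for k in range(steps + 1):
        s = k / steps
        for sign1 in (1, -1):
            for sign2 in (1, -1):
                best = min(best, f(sign1 * s, sign2 * (1 - s)))
    return best

m_squared = min_on_unit_sphere()
m = math.sqrt(m_squared)
print(m_squared, m)   # m is strictly positive, as the compactness argument predicts
```

With this value of $m$ one can spot-check $m(|c_1| + |c_2|) \le \|\phi\|_2$ for particular coefficients by comparing `m * (abs(c1) + abs(c2))` with `math.sqrt(f(c1, c2))`.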


EXERCISES 


1. Let $X$ denote the collection of all real solutions $\phi$ of the differential equation
\[ x'' + bx' + cx = 0, \]
where $b$ and $c$ are real constants. Assume that $X$ has the sup-norm $\|\phi\|_\infty$. Show
that $X$ is topologically equivalent with $R^2$ with the norm $\|(c_1,c_2)\| = |c_1| + |c_2|$.
[Hint: Distinguish between the three cases where the roots of $r^2 + br + c = 0$
are real and different, real and the same, and complex conjugates.]


2. (Continuation of Exercise 1.) Assume that coefficients $b$ and $c$ are complex
and that $X$ is the space of all complex solutions. Show that $X$ is topologically
equivalent with $C^2$.


3. (Continuation of Exercise 1.)
(a) Show that the norm $\|\phi\|_2 = (\int_0^1 |\phi(t)|^2 \, dt)^{1/2}$ is equivalent to $\|\phi\|_\infty$ on $X$.
(b) Show that these norms are not equivalent on the larger space $C[0,1]$.


4. Show that the norms $\|\cdot\|_p$, $1 \le p \le \infty$, are equivalent on $R^n$, see Example 1,
Section 3. [Hint: See Section 3.11.]


5. (a) Show that $(R^2,\|\cdot\|_1)$ and $(R^2,\|\cdot\|_\infty)$ are isometrically isomorphic.
(b) Show that $(R^3,\|\cdot\|_1)$ and $(R^3,\|\cdot\|_\infty)$ are topologically isomorphic but not
isometrically isomorphic. [Hint: Sketch the unit ball $\{x: \|x\| \le 1\}$ in each
of these spaces.]


6. A Banach space $B$ is said to be the completion of a normed linear space $X$ if
there is an isometric isomorphism $\phi: X \to B$ with the property that the range
$\phi(X)$ is dense in $B$. Show that any two completions of a given normed linear
space $X$ are isometrically isomorphic. [Hint: Use Exercise 3, Section 6.]


7. (Continuation of Exercise 1, Section 3.14.) The following steps will lead to a
proof that every normed linear space $(X,\|\cdot\|)$ has a completion. Let $Y$ denote
the collection of all Cauchy sequences $\{x_n\}$ from $X$.
(a) Show that $n(\{x_n\}) = \lim \|x_n\|$ exists for every $\{x_n\}$ in $Y$.
(b) Define a "new" equality on $Y$ by saying that $\{x_n\} = \{x_n'\}$ if $n(\{x_n - x_n'\}) = 0$.
Show that $Y$ is a linear space in terms of this new equality and that
$n(\cdot)$ is a norm on $Y$.
(c) Show that $(Y,n(\cdot))$ is complete.
(d) Show that the mapping $\phi: x \to (x,x,x,\ldots)$ is an isometric isomorphism of
$X$ into $Y$ and that $\phi(X)$ is dense in $Y$.


8. (a) Show that $l_2(0,\infty)$ and $l_2(-\infty,\infty)$ are isometrically isomorphic.
(b) Show that $l_p(0,\infty)$ and $l_p(-\infty,\infty)$, $1 \le p \le \infty$, are isometrically isomorphic.


9. (a) Show that $L_2[0,1]$ and $L_2[0,\infty)$ are topologically isomorphic. Are they
isometrically isomorphic?
(b) Let $I$ and $J$ be two nontrivial intervals. Show that $L_p(I)$ and $L_p(J)$, $1 \le p \le \infty$,
are topologically isomorphic.


10. Define the mapping
\[ U_\tau : f(t) \to f(t + \tau) \]
for $\tau \in R$.
(a) Show that $U_\tau$ is an isometric isomorphism of $L_p(-\infty,\infty)$ onto $L_p(-\infty,\infty)$
for $1 \le p < \infty$.
(b) Is $U_\tau$ a topological isomorphism of $L_\infty(-\infty,\infty)$ onto $L_\infty(-\infty,\infty)$?


11. Let $A$ be a bounded linear operator on $L_2(-\infty,\infty)$ and define $U_t$ by
\[ U_t = e^{At} = \sum_{n=0}^{\infty} \frac{A^n t^n}{n!} \]
as in Exercise 4, Section 4.
(a) Show that $U_t$ is an isometric isomorphism on $L_2(-\infty,\infty)$.
(b) Show that $U_t$ is an isometric isomorphism of $L_p(-\infty,\infty)$ onto $L_p(-\infty,\infty)$
for $1 \le p < \infty$.
(c) Show that if $\{t_n\}$ is a sequence in $R$ with $\lim t_n = 0$, then $\lim U_{t_n} = I$, the
identity.
(d) What happens to the above for $p = \infty$?




12. Let $\Omega$ be a compact metric space and let $\pi: \Omega \times R \to \Omega$ be a continuous flow on
$\Omega$, that is, $\pi(x,0) = x$, $\pi(x,t+s) = \pi(\pi(x,t),s)$ and $\pi$ is continuous. Assume
there is a probability measure $\mu$ on $\Omega$ with the property that $\mu(A) = \mu(\pi(A,t))$
for all $t$ and every measurable set $A$. Let $f: \Omega \to R$ be given and define the
mapping $U_t$, $-\infty < t < \infty$, by
\[ (U_t f)(x) = f(\pi(x,t)). \]
(a) Show $U_t$ is an isometric isomorphism of $L_p(\Omega,\mu)$ onto itself for $1 \le p < \infty$.
(b) Show that if $\{t_n\}$ is a sequence in $R$ with $\lim t_n = 0$, then $\lim U_{t_n} = I$.
(c) What happens to the above for $p = \infty$?
(d) What happens on $C(\Omega,R)$ with the sup-norm?


13. Let $X$ and $Y$ be normed linear spaces. Show that there may exist an isometry
$\phi: X \to Y$ that is not linear. [Hint: See Exercise 11, Section 3.10.]


14. Let X and Y be topologically isomorphic normed linear spaces. Show that X 
is complete if and only if Y is complete. 


15. (a) Characterize the family of all topological isomorphisms of the real space $R^2$
onto itself.
(b) Show that if $R^2$ has the Euclidean norm $\|x\|_2 = (|x_1|^2 + |x_2|^2)^{1/2}$, then every
isometric isomorphism of $R^2$ onto itself is a matrix operator of the form
\[ \begin{pmatrix} \cos\theta & \sin\theta \\ -\sin\theta & \cos\theta \end{pmatrix} \]
for some $\theta$.


10. FINITE-DIMENSIONAL SPACES 


One should not be surprised to learn that the theory of finite-dimensional 
normed linear spaces is much simpler than that of general normed linear spaces. 
In this section we shall explore this simplification. We shall show, among other 
things, that all finite-dimensional normed linear spaces are Banach spaces (that is,
complete), every linear subspace is closed, boundedness implies total boundedness,
and all linear transformations are continuous. First, however, we need to prove a
lemma regarding the representation of a vector in terms of a basis.

If $\{x_1,\ldots,x_n\}$ is a (Hamel) basis⁶ in a finite-dimensional normed linear space
$X$, we know from Section 4.7 that each $x$ in $X$ can be expressed uniquely in the form
\[ x = \alpha_1 x_1 + \cdots + \alpha_n x_n, \]
where the $\alpha_i$'s are scalars. The following lemma shows that each coefficient $\alpha_i$ is a
continuous linear function of $x$. We shall use this fact repeatedly below.


⁶ The algebraic concept of a Hamel basis for a linear space was introduced in Section 4.7. As long
as we are discussing finite-dimensional normed linear spaces this is usually the concept of basis
employed. Although every infinite-dimensional normed linear space has a Hamel basis, this purely
algebraic concept is of limited usefulness. There are concepts of basis for infinite-dimensional
normed linear spaces that do combine topological and algebraic structure; however, with one
extremely important exception, they are not discussed in this book. The exception, which is discussed
in Section 17, is the concept of an orthonormal basis for a Hilbert space.




5.10.1 LEMMA. Let $X$ be a finite-dimensional normed linear space and let
$\{x_1,\ldots,x_n\}$ be a basis for $X$. Then each coefficient $\alpha_i$, $1 \le i \le n$, in the expansion
\[ x = \alpha_1 x_1 + \cdots + \alpha_n x_n \tag{5.10.1} \]
is a continuous linear function of $x$. In particular, there is a constant $M$ such that
$|\alpha_i| \le M\|x\|$ for $1 \le i \le n$ and all $x \in X$.


Proof: Let $l_i: X \to F$ be the mapping of $X$ into the scalar field $F$ given by
$\alpha_i = l_i(x)$, $1 \le i \le n$. It is easy to see that $l_i$ is linear. We shall now show that such a
constant $M$ exists.
If we can show that there is an $m > 0$ such that
\[ m(|\alpha_1| + \cdots + |\alpha_n|) \le \|x\| = \|\alpha_1 x_1 + \cdots + \alpha_n x_n\|, \tag{5.10.2} \]
then it follows that $|\alpha_i| \le m^{-1}\|x\|$.
Let us first prove (5.10.2) for sets of coefficients $\{\alpha_1,\ldots,\alpha_n\}$ that satisfy the
condition $|\alpha_1| + \cdots + |\alpha_n| = 1$. (See Figure 5.10.1.) Let
\[ A = \{(\alpha_1,\ldots,\alpha_n) \in F^n : |\alpha_1| + \cdots + |\alpha_n| = 1\}. \]

[Figure 5.10.1. Vectors $x = \alpha_1 x_1 + \alpha_2 x_2$ on the lines connecting the four vectors
$x_1$, $x_2$, $-x_1$, $-x_2$ are those such that $|\alpha_1| + |\alpha_2| = 1$.]

It follows from Theorem 3.17.20 that $A$ is a compact set in $F^n$, where $F^n$ has the
norm $\|\alpha\|_1 = \sum_{i=1}^{n} |\alpha_i|$.
Let $f: A \to R$ be given by⁷
\[ f(\alpha_1,\ldots,\alpha_n) = \|\alpha_1 x_1 + \cdots + \alpha_n x_n\|. \]
This mapping $f$ is continuous (see Exercise 3) and $f \ge 0$. Let
\[ m = \inf\{ f(\alpha_1,\ldots,\alpha_n) : (\alpha_1,\ldots,\alpha_n) \in A \}. \]
It follows from Theorem 3.17.21 that there is a point $(\alpha_1^0,\ldots,\alpha_n^0)$ in $A$ with
$f(\alpha_1^0,\ldots,\alpha_n^0) = m$. (See Figure 5.10.2.) If $m = 0$, then $\alpha_1^0 x_1 + \cdots + \alpha_n^0 x_n = 0$ and
this contradicts the fact that $\{x_1,\ldots,x_n\}$ is a basis. It then follows that $m > 0$
and (5.10.2) holds for this value of $m$ and for $(\alpha_1,\ldots,\alpha_n)$ in $A$.


⁷ Compare this argument with Example 4, Section 9.


[Figure 5.10.2. Locus of constant norm: $\|\alpha_1^0 x_1 + \alpha_2^0 x_2\| = m$.]


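For an arbitrary set of coefficients $(\alpha_1,\ldots,\alpha_n)$, set $\beta = |\alpha_1| + \cdots + |\alpha_n|$. If $\beta = 0$,
(5.10.2) obviously holds. If $\beta \ne 0$, then
\[ \|\alpha_1 x_1 + \cdots + \alpha_n x_n\| = \beta \Big\| \frac{\alpha_1}{\beta} x_1 + \cdots + \frac{\alpha_n}{\beta} x_n \Big\|
 = \beta f\Big(\frac{\alpha_1}{\beta},\ldots,\frac{\alpha_n}{\beta}\Big) \ge m\beta = m(|\alpha_1| + \cdots + |\alpha_n|). \; ∎ \]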
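
The constant $m$ of (5.10.2) depends on the basis, and it becomes small when the basis vectors are nearly dependent. The following Python sketch illustrates this in $R^2$ with the Euclidean norm; the basis chosen, the sampling of the set $A$, and the function names are all assumptions made for the purpose of illustration.

```python
import math

def euclidean_norm(v):
    return math.sqrt(sum(component * component for component in v))

def combine(alpha, basis):
    """Return alpha_1*x_1 + alpha_2*x_2 for a basis (x_1, x_2) of R^2."""
    (a1, a2), (x1, x2) = alpha, basis
    return [a1 * x1[i] + a2 * x2[i] for i in range(2)]

def estimate_m(basis, steps=20000):
    """Approximate m = min ||a1*x1 + a2*x2|| over |a1| + |a2| = 1.

    Since f(-alpha) = f(alpha), sampling two of the four edges of A suffices.
    A sampled minimum is slightly larger than the true minimum.
    """
    best = float("inf")
    for k in range(steps + 1):
        s = k / steps
        for sign in (1, -1):
            best = min(best, euclidean_norm(combine((s, sign * (1 - s)), basis)))
    return best

# A nearly dependent basis: x_1 = (1, 0) and x_2 = (1, 0.1).
basis = ([1.0, 0.0], [1.0, 0.1])
m = estimate_m(basis)
print(m, 1 / m)   # m is small but positive, so M = 1/m is large but finite
```

For this basis the coefficient bound $|\alpha_i| \le m^{-1}\|x\|$ is far from tight for most vectors, but it cannot be improved uniformly: the sampled minimizer is a vector of small norm whose coefficients have absolute values summing to one.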


The next example shows an infinite-dimensional case where all the functionals
$l_i$ are continuous but where the set of norms $\{\|l_i\|\}$ is unbounded. That is, there is
no $M > 0$ such that $\|l_i\| \le M$ for all $i$.


EXAMPLE 1. Suppose a sampled-data random process $\{x(1),x(2),x(3),\ldots\}$
can be modeled as the response of a filter whose input is a random process
$\{n(0),n(1),n(2),\ldots\}$, where the $n$'s are stochastically independent, $E\{n(i)\} = 0$, and
$E\{|n(i)|^2\} = 1$ for all $i$. In particular, we will now suppose that $x(1) = n(0)$ and
$x(k+1) = x(k) + a(k)n(k)$ for $k = 1, 2, \ldots$, where $\{a(k)\}$ is a sequence of real
numbers such that $0 < a(k) < 1$, $a(k+1) < a(k)$ for all $k$ and $\lim_{k\to\infty} a(k) = 0$. We
can view this random process as a sequence in the normed linear space $X$ made up
of all complex-valued random variables $x$ such that $E\{|x|^2\} < \infty$ and $E\{x\} = 0$
with $\|x\| = \sqrt{E\{|x|^2\}}$.

Let $M$ denote the linear subspace of $X$ made up of all finite linear combinations
of vectors in the set $B = \{x(1),x(2),\ldots\}$. It should be clear that $B$ is a Hamel
basis for $M$ and, consequently, $M$ is infinite dimensional. The linear functionals $\{l_k\}$
can be defined by $l_k(x(j)) = \delta_{kj}$, where $\delta_{kj}$ is the Kronecker function. Since
$n(j) = [x(j+1) - x(j)]/a(j)$ for $j \ge 1$, one then has
\[ l_k(n(j)) = \begin{cases} 0, & k \ne j \text{ and } k \ne j+1, \\ -1/a(k), & k = j, \\ 1/a(k-1), & k = j+1. \end{cases} \]
It can be shown that each $l_k$ is continuous with $\|l_k\| = 1/a(k-1)$. However,
$\|l_k\| \to \infty$ as $k \to \infty$, because $a(k) \to 0$ as $k \to \infty$. ∎




The last example should not be misinterpreted. There are plenty of infinite-dimensional
examples in which one has $\|l_k\| \le M$ for all $k$. It simply does not occur
in Example 1.


EXAMPLE 2. (CONTINUATION OF EXAMPLE 1.) Let $\{y(0),y(1),y(2),\ldots\}$ be the
set of random variables, where $y(0) = n(0)$, $y(k) = \alpha_k n(0) + \beta_k n(k)$, and $\alpha_k^2 + \beta_k^2 = 1$.
Moreover, $\beta_k > \beta_{k+1} > 0$ and $\lim_{k\to\infty} \beta_k = 0$. It should be clear that the set $\{y(k)\}$
is linearly independent.

Let $M$ be a subspace defined as above in Example 1. Then the linear functional
$l_0$ such that $l_0(y(0)) = 1$ and $l_0(y(k)) = 0$ for $k = 1, 2, \ldots$ is not continuous. This
follows from the fact that $\lim_{k} y(k) = y(0)$. ∎


The first thing to be said about finite-dimensional normed linear spaces is that 
they are always complete. 


5.10.2 THEOREM. If a normed linear space $X$ is finite dimensional, it is a
Banach space.


Proof: Let $\{x_1,x_2,\ldots,x_n\}$ be any basis for $X$ and let $\{z_k\}$ be any Cauchy
sequence in $X$. We must show that the sequence $\{z_k\}$ is convergent. Let
\[ z_k = \alpha_{k1} x_1 + \cdots + \alpha_{kn} x_n \]
for $k = 1, 2, \ldots$. It follows from Lemma 5.10.1 that there is a constant $M$ such
that
\[ |\alpha_{kj} - \alpha_{lj}| \le M\|z_k - z_l\|, \qquad 1 \le j \le n. \]
Hence each sequence of scalars $\{\alpha_{kj}\}$ is a Cauchy sequence. Since the scalar field (the
real or complex numbers) is complete, the sequence $\{\alpha_{kj}\}$ is convergent. Let
$\alpha_{0j} = \lim_{k\to\infty} \alpha_{kj}$, $j = 1, 2, \ldots, n$. If we let $z_0 = \alpha_{01} x_1 + \cdots + \alpha_{0n} x_n$, then
\[ \|z_k - z_0\| = \|(\alpha_{k1} - \alpha_{01})x_1 + \cdots + (\alpha_{kn} - \alpha_{0n})x_n\|
 \le |\alpha_{k1} - \alpha_{01}|\,\|x_1\| + \cdots + |\alpha_{kn} - \alpha_{0n}|\,\|x_n\|. \]
It quickly follows that $\{z_k\}$ converges to $z_0$. ∎


5.10.3 THEOREM. Let M be a finite-dimensional linear subspace of a normed 
linear space X. Then M is closed. 


In particular, then, a linear subspace $M$ of a finite-dimensional normed linear
space $X$ is always closed.


Proof: It follows from Theorem 5.10.2 that every Cauchy sequence in $M$
converges and has its limit in $M$. Since every convergent sequence is a Cauchy
sequence, it follows that every convergent sequence in $M$ has its limit in $M$. By
applying the Closed Set Theorem, we see that $M$ is closed. ∎




It is undoubtedly true that one of the major simplifications arising from finite
dimensionality is that all linear transformations are continuous.


5.10.4 THEOREM. Let $L: X \to Y$ be a linear operator where $X$ and $Y$ are
normed linear spaces. If $X$ is finite dimensional, then $L$ is continuous.


Note that Y need not be finite dimensional. 


Proof: Let $\{x_1,\ldots,x_n\}$ be a Hamel basis for $X$. Then any point $x$ in $X$ can
be uniquely expressed as $x = \alpha_1 x_1 + \cdots + \alpha_n x_n$, where the $\alpha$'s are scalars. Let
$D = \max\{\|Lx_i\| : 1 \le i \le n\}$. Then
\[ \|Lx\| = \|\alpha_1 Lx_1 + \cdots + \alpha_n Lx_n\| \le |\alpha_1|\,\|Lx_1\| + \cdots + |\alpha_n|\,\|Lx_n\|
 \le D\{|\alpha_1| + \cdots + |\alpha_n|\}. \]
But from Lemma 5.10.1 there exists a constant $M$ such that
\[ |\alpha_1| + \cdots + |\alpha_n| \le M\|x\|, \]
which implies that
\[ \|Lx\| \le DM\|x\|. \]
Hence $L$ is bounded or, equivalently, continuous. (Carefully note how this proof
uses the fact that $X$ is finite dimensional.) ∎


We know from Theorem 4.7.5 that two linear spaces (over the same scalar
field) are isomorphic if and only if they have the same (algebraic) dimension. In
Section 9 we introduced the two concepts of topological and isometric isomor- 
phisms. In general it is not true that those latter two forms of equivalence are 
determined simply by dimension. However, one can say the following for the 
finite-dimensional case. 


5.10.5 THEOREM. Let $L: X \to Y$ be an isomorphism of $X$ onto $Y$ where $X$ is
finite dimensional. Then $L$ is a homeomorphism, that is, $L$ is a topological isomorphism.
In particular, two finite-dimensional normed linear spaces (over the same scalar field)
are topologically isomorphic if and only if they have the same dimension.


This is a direct consequence of Theorem 5.10.4. Note that we say "topologically
isomorphic." Two finite-dimensional spaces of the same dimension are not
necessarily isometrically isomorphic. (See Exercise 1.)

The following is a very easily proven theorem. 


5.10.6 THEOREM. Let $\|\cdot\|_a$ and $\|\cdot\|_b$ be two norms on a finite-dimensional linear
space $X$. Then $\|\cdot\|_a$ and $\|\cdot\|_b$ are equivalent.


Let us now turn to a topological characterization of the algebraic concept of 
finite dimensionality. 




5.10.7 THEOREM. Let $X$ be a normed linear space and let $D = \{x \in X : \|x\| \le 1\}$.
Then $X$ is finite dimensional if and only if $D$ is compact.


Proof: First assume that $X$ is finite dimensional and let $D = \{x : \|x\| \le 1\}$.
Let $\{z_k\}$ be a sequence in $D$. We want to show that there is a convergent subsequence
with limit in $D$. Let $\{x_1,\ldots,x_n\}$ be a basis for $X$ and let
\[ z_k = \alpha_{k1} x_1 + \cdots + \alpha_{kn} x_n. \]
It follows from Lemma 5.10.1 that there is a constant $M$ such that
\[ |\alpha_{k1}| + \cdots + |\alpha_{kn}| \le M\|z_k\| \le M. \]
Since the coefficients $(\alpha_{k1},\ldots,\alpha_{kn})$ lie in a closed bounded set in $F^n$, we can find
a subsequence that converges in $F^n$, say that
\[ (\alpha_{k'1},\ldots,\alpha_{k'n}) \to (\alpha_{01},\ldots,\alpha_{0n}). \]
One then has $z_{k'} \to z_0 = \alpha_{01} x_1 + \cdots + \alpha_{0n} x_n$. Since $\|z_0\| = \lim \|z_{k'}\| \le 1$, we see
that $z_0 \in D$. Hence $D$ is compact.

Now assume that $D$ is compact. We shall show that $X$ has finite dimension by
contradiction. Let $x_1 \in X$ with $\|x_1\| = 1$. Let $M_1$ be the linear space generated by
$\{x_1\}$. If $\dim X \ge 2$, then by the Riesz Theorem (5.5.4) we can find a vector $x_2$ in $X$
such that $\|x_2\| = 1$ and $\|x_2 - x_1\| \ge \tfrac{1}{2}$. We now proceed by induction. Assume that
we have chosen vectors $\{x_1,\ldots,x_n\}$ with the property that $\|x_i\| = 1$, $1 \le i \le n$, and
$\|x_i - x_j\| \ge \tfrac{1}{2}$ for $i \ne j$. Let $M_n$ be the linear space generated by $\{x_1,\ldots,x_n\}$. If
$\dim X \ge n+1$, then we can find a vector $x_{n+1}$ in $X$ such that $\|x_{n+1}\| = 1$ and
$\|x_{n+1} - x_i\| \ge \tfrac{1}{2}$ for $1 \le i \le n$. In this way, if $\dim X$ is not finite, we can construct a
sequence $\{x_1,x_2,\ldots\}$ in $D$ with the property that $\|x_i - x_j\| \ge \tfrac{1}{2}$ for $i \ne j$. But such a
sequence has no convergent subsequence and we have contradicted the fact that $D$
is compact. ∎
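
A concrete instance of the separated sequence in the second half of the proof is provided by the space $W$ of finitely nonzero sequences with the $l_1$ norm (Example 3, Section 9). There the unit vectors $e_i$ all lie in $D$ and are a fixed distance apart, so no appeal to the Riesz Theorem is needed. The Python sketch below is illustrative only; it truncates each sequence to a finite list, and the function names are assumptions.

```python
def unit_vector(i, length):
    """Sequence with 1 in position i and 0 elsewhere, truncated to `length` entries."""
    return [1.0 if k == i else 0.0 for k in range(length)]

def l1_norm(z):
    return sum(abs(entry) for entry in z)

def pairwise_distances(count):
    """Distances ||e_i - e_j|| in the l1 norm for 0 <= i < j < count."""
    vectors = [unit_vector(i, count) for i in range(count)]
    return [l1_norm([a - b for a, b in zip(vectors[i], vectors[j])])
            for i in range(count) for j in range(i + 1, count)]

distances = pairwise_distances(8)
print(min(distances), max(distances))   # every pair is the same distance apart
```

Because the distances do not shrink, no subsequence of $\{e_i\}$ is Cauchy, which is consistent with the theorem: the closed unit ball of the infinite-dimensional space $W$ is not compact.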


Recall that a metric space is said to be separable if it contains a countable set that 
is dense in the metric space. It is easy to see that any finite-dimensional normed 
linear space is separable. Indeed, if $\{x_1,x_2,\ldots,x_n\}$ is a basis, the set made up of all
$x$'s of the form $x = r_1 x_1 + \cdots + r_n x_n$, where the $r$'s are rational or have rational
real and imaginary parts, is a countable dense set.


EXAMPLE 3. Let $X$ and $Y$ be normed linear spaces, and let $L: X \to Y$ be a
continuous linear transformation. Suppose that the range of $L$, $\mathcal{R}(L)$, is finite
dimensional. $Y$ itself may be infinite dimensional. Then $L$ has an interesting property,
namely, it maps bounded sets in $X$ into compact sets in $Y$. In order to see this let
$A$ be any bounded set in $X$. It follows that there is an $r > 0$ such that $A \subset B_r[0]$, the
closed ball of radius $r$. Since $L$ is continuous, $L(A) \subset B_{r'}[0]$, where $r' = \|L\|r$. But
$L(A) \subset \mathcal{R}(L) \cap B_{r'}[0]$, and from Theorem 5.10.7 the set $\mathcal{R}(L) \cap B_{r'}[0]$ is compact.
Thus $L$ maps bounded sets into compact sets. ∎


We can conclude this section by observing that things are better with finite- 
dimensional normed linear spaces(!) 




EXERCISES 


1. Let $X$ and $Y$ be finite-dimensional normed linear spaces with the same dimension.
It follows from Theorem 5.10.5 that they are topologically isomorphic.
This does not mean that they are isometrically isomorphic, as you are now asked
to show. Let $X = Y = R^2$ and define the two norms
\[ \|x\|_1 = |x_1| + |x_2| \quad \text{and} \quad \|x\|_2 = (|x_1|^2 + |x_2|^2)^{1/2}. \]
Show that $(X,\|\cdot\|_1)$ is not isometrically isomorphic with $(X,\|\cdot\|_2)$.

2. Show that a normed linear space $X$ is finite dimensional if and only if it has the
property that every closed, bounded set is compact.

3. Show that the function $f(\alpha_1,\ldots,\alpha_n)$, defined in the proof of Lemma 5.10.1, is
continuous. [Hint: Show that
\[ |f(\alpha_1,\ldots,\alpha_n) - f(\beta_1,\ldots,\beta_n)| \le M(|\alpha_1 - \beta_1| + \cdots + |\alpha_n - \beta_n|) \]
for some $M$.]

4. Prove Theorem 5.10.6.

5. Let $M$ and $N$ be linear subspaces of a normed linear space $X$ and define
\[ \theta(M,N) = \inf\{\|x - y\| : x \in M,\ y \in N,\ \|x\| = \|y\| = 1\}. \]
Show that if $X$ is finite dimensional, then $\theta(M,N) > 0$ if and only if $M \cap N = \{0\}$.

6. Show that the conclusion of Exercise 5 is false in infinite-dimensional spaces.

7. Let $X$ be a normed linear space with the property that every linear mapping
$L: X \to Y$ is continuous. Show that $X$ is finite dimensional.


11. NORMED CONJUGATE SPACE AND CONJUGATE OPERATOR


In Section 4.12 we discussed the algebraic conjugate of a linear space. Recall
that the algebraic conjugate $X^f$ of a linear space $X$ is the linear space made up of all
linear functionals on $X$. In the case of normed linear spaces one usually restricts
attention to a linear subspace $X' \subset X^f$, called the normed conjugate or, simply,
conjugate space. This linear subspace of $X^f$ is the one made up of all continuous
linear functionals. Since every linear transformation defined on a finite-dimensional
space is continuous, one has $X' = X^f$ when $X$ is finite dimensional. However, in
general $X'$ is a proper linear subspace of $X^f$.

In Section 4.13 we discussed the transpose of a linear transformation, where the
transpose $L^T: Y^f \to X^f$ of a linear transformation $L: X \to Y$ is defined by
\[ \langle x, L^T y' \rangle = \langle Lx, y' \rangle \]
for all $x \in X$ and $y' \in Y^f$. The conjugate $L'$ of a continuous linear operator $L: X \to Y$
is the continuous linear transformation $L': Y' \to X'$ defined by
\[ \langle x, L' y' \rangle = \langle Lx, y' \rangle \]
for all $x \in X$ and $y' \in Y'$. Note the difference between $L^T$ and $L'$. $L^T$ is defined for
all linear transformations and $L'$ is defined for continuous ones. $L^T$ maps $Y^f$ into
$X^f$ and $L'$ maps $Y'$ into $X'$. If $L$ is continuous, $L'$ is defined and is the restriction of
$L^T$ to $Y'$. Of course, if $X$ and $Y$ are finite dimensional, then $L' = L^T$.

There is a great deal that can be said about normed conjugate spaces and 
conjugate transformations. Moreover, what can be said is very useful in applications. 


We shall discuss only the Hilbert space versions of these concepts in Sections 21 
and 22. 


Part B 
Hilbert Spaces 


12. INNER PRODUCT SPACES AND HILBERT SPACES


A nice thing about normed linear spaces is that their geometry is much like 
the familiar two- and three-dimensional Euclidean geometry. Inner product spaces 
and Hilbert spaces are even nicer because their geometry is even closer to Euclidean 
geometry. In particular, these latter cases include the concept of orthogonality or 
perpendicularity. This "nicer" structure of inner product and, especially, Hilbert
spaces leads to remarkable simplifications.

We begin with a definition of inner product. 


5.12.1 DEFINITION. Let $X$ be a complex linear space. An inner product on $X$
is a mapping that associates to each ordered pair of vectors $x$, $y$ a scalar, denoted
$(x,y)$, that satisfies the following properties:

(IP1) $(x + y, z) = (x,z) + (y,z)$; (Additivity)
(IP2) $(\alpha x, y) = \alpha(x,y)$; (Homogeneity)
(IP3)⁸ $(x,y) = \overline{(y,x)}$; (Symmetry)
(IP4) $(x,x) > 0$, when $x \ne 0$. (Positive Definiteness)


These four properties are to hold for any vectors $x$, $y$, $z$ in $X$ and any scalar $\alpha$.
It should be noted that the properties:

(IP5) $(x, y + z) = (x,y) + (x,z)$;
(IP6) $(x, \alpha y) = \bar{\alpha}(x,y)$;

are immediate consequences of (IP1), (IP2), and (IP3). Furthermore, it may appear
from (IP4) that we are tacitly assuming $(x,x)$ to be real. However, (IP3) implies
that $(x,x)$ is indeed real. Also note that $(0,0) = (0,x) = (x,0) = 0$ for all $x$ in $X$.


5.12.2 LEMMA. If $(x,y) = 0$ for all $y \in X$, then $x = 0$.
Proof: $(x,x) = 0$ implies $x = 0$. ∎


The notion of an inner product on a real linear space is defined similarly. In 
this case the range of the mapping (x,y) is in the real numbers, that is, the scalar 
field. Also, (IP3) and (IP6) are simplified. That is, (IP3) becomes $(x,y) = (y,x)$, and
(IP6) becomes $(x,\alpha y) = \alpha(x,y)$. It turns out that in the majority of situations one
uses complex inner product spaces. Therefore, unless stated otherwise, all spaces are 
henceforth complex. We also caution the reader that a few of the following results in 
this chapter are not valid when real spaces are used. 


⁸ As usual the bar denotes the complex conjugate.




The archetype for a real inner product is the dot product of classical vector 
analysis, 


\[ (x,y) = x \cdot y = \sum_{i=1}^{3} x_i y_i. \]
Other examples of inner products are presented in the next section. 


5.12.3 DEFINITION. An inner product space is defined to be a linear space X 
together with an inner product defined on X. 


Our first task is to show how the inner product generates a norm. Specifically,
we shall show that the function
\[ \|x\| = (x,x)^{1/2} \tag{5.12.1} \]
defines a norm on $X$. But before doing this, we need an important inequality.

5.12.4 LEMMA. (SCHWARZ INEQUALITY.) Let $(x,y)$ be an inner product on a
linear space $X$. Then
\[ |(x,y)| \le \|x\|\,\|y\|, \tag{5.12.2} \]
where $\|x\|$ and $\|y\|$ are defined by (5.12.1).


Figure 5.12.1. 


In the familiar two-dimensional vector analysis case shown in Figure 5.12.1 we
know that for $l_1 = \|x\|$ and $l_2 = \|y\|$ one has $x \cdot y = l_1 l_2 \cos\theta$ and obviously
$|x \cdot y| \le l_1 l_2$. The Schwarz Inequality is simply a generalization of this familiar
geometric fact to inner product spaces.


Proof: We will assume that $X$ is a linear space over $C$, the complex numbers.
The same argument applies to linear spaces over $R$. If either $x$ or $y$ is the origin, the
Schwarz Inequality is obviously true. So assume $x \ne 0$ and $y \ne 0$. If $\alpha$ is any complex
number, then
\[ 0 \le (x - \alpha y, x - \alpha y). \]
In particular, if $\alpha = (x,y)/(y,y)$, then
\[ 0 \le (x - \alpha y, x - \alpha y) = \|x\|^2 - \frac{|(x,y)|^2}{\|y\|^2}. \; ∎ \]


One might suspect from Figure 5.12.1 that $|(x,y)| = \|x\|\,\|y\|$ if and only if $x$
and $y$ are collinear, that is, the set $\{x,y\}$ is linearly dependent. Inspection of the
above proof shows that this is indeed the case. This simple observation has an 
amazing number of applications, particularly in optimization. 
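
The inequality, the equality case, and the identity used in the proof can all be checked on concrete vectors. The sketch below works in $C^n$ with the standard inner product, linear in the first argument and conjugate-linear in the second as in (IP2) and (IP6). The function names and the test vectors are assumptions chosen for illustration.

```python
import math

def inner(x, y):
    """Standard inner product on C^n: linear in x, conjugate-linear in y."""
    return sum(a * b.conjugate() for a, b in zip(x, y))

def norm(x):
    return math.sqrt(inner(x, x).real)

def schwarz_gap(x, y):
    """Return ||x||*||y|| - |(x,y)|, which (5.12.2) says is nonnegative."""
    return norm(x) * norm(y) - abs(inner(x, y))

def residual_identity(x, y):
    """Both sides of ||x - a*y||^2 = ||x||^2 - |(x,y)|^2/||y||^2 with a = (x,y)/(y,y)."""
    a = inner(x, y) / inner(y, y)
    left = norm([xi - a * yi for xi, yi in zip(x, y)]) ** 2
    right = norm(x) ** 2 - abs(inner(x, y)) ** 2 / norm(y) ** 2
    return left, right

x = [1 + 2j, 3 - 1j, 0.5j]
y = [2 - 1j, 1j, 1 + 1j]
z = [(2 - 3j) * xi for xi in x]   # z is a scalar multiple of x, so {x, z} is dependent

print(schwarz_gap(x, y))          # positive: x and y are not collinear
print(schwarz_gap(x, z))          # zero up to rounding: the equality case
print(residual_identity(x, y))    # the two sides agree up to rounding
```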


EXAMPLE 1. Let $X = L_2[0,T]$ and let $F$ be the filter mapping $X$ into itself
defined by
\[ (Fx)(t) = \int_0^t e^{-(t-\tau)} x(\tau) \, d\tau, \qquad t \in [0,T]. \]
Let $(x,y) = \int_0^T x(t)\overline{y(t)} \, dt$ denote the inner product on $X$.
Suppose we want to choose an input $x \in X$ such that $(x,x) = 1$ and $(Fx)(T)$ is
maximum. Let $y(t) = e^{t}$, then
\[ (Fx)(T) = e^{-T}(y,x). \]
So from the Schwarz Inequality
\[ |(Fx)(T)| \le e^{-T}\|y\|\,\|x\| \]
with the equality being taken when $x = cy$, where $c$ is a constant, that is, when $x$
and $y$ are collinear. In particular, then, the solution of our problem is
\[ x(t) = \sqrt{\frac{2}{e^{2T} - 1}}\; e^{t}, \qquad t \in [0,T]. \]
Moreover,
\[ (Fx)(T) = e^{-T}\sqrt{\tfrac{1}{2}[e^{2T} - 1]} = \sqrt{\tfrac{1}{2}[1 - e^{-2T}]}. \; ∎ \]
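
The closed-form answer in Example 1 can be compared against direct numerical integration. The following Python sketch uses a composite Simpson rule written out by hand; the choice of quadrature, the step count, the comparison input, and the function names are assumptions made for illustration, and only real-valued inputs are considered.

```python
import math

def simpson(g, a, b, n=2000):
    """Composite Simpson rule for the integral of g over [a, b]; n must be even."""
    h = (b - a) / n
    total = g(a) + g(b)
    for k in range(1, n):
        total += (4 if k % 2 else 2) * g(a + k * h)
    return total * h / 3

def filter_output_at_T(x, T):
    """(Fx)(T): the integral over [0, T] of e^{-(T - tau)} x(tau) d tau."""
    return simpson(lambda tau: math.exp(-(T - tau)) * x(tau), 0.0, T)

def optimal_input(T):
    """The unit-norm input x(t) = sqrt(2/(e^{2T} - 1)) e^t from the example."""
    scale = math.sqrt(2.0 / (math.exp(2 * T) - 1))
    return lambda t: scale * math.exp(t)

def optimal_value(T):
    """Closed form sqrt((1 - e^{-2T})/2) for the maximal output."""
    return math.sqrt(0.5 * (1 - math.exp(-2 * T)))

T = 1.0
x_opt = optimal_input(T)
x_const = lambda t: 1 / math.sqrt(T)   # another input with (x, x) = 1, for comparison

print(simpson(lambda t: x_opt(t) ** 2, 0.0, T))        # the constraint (x, x) = 1
print(filter_output_at_T(x_opt, T), optimal_value(T))  # these two should agree
print(filter_output_at_T(x_const, T))                  # smaller than the optimum
```

The constant input is only one competitor; the Schwarz Inequality is what guarantees that no unit-norm input does better than `x_opt`.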
We now want to show that ||x|| is a norm on X. 


5.12.5 THEOREM. Let $X$ be a linear space with inner product $(x,y)$ and define
$\|x\|$ by (5.12.1). Then $\|x\|$ is a norm on $X$.


Proof: It is very easy to see that $\|x\|$ satisfies properties (N1), (N3), and (N4)
for a norm, and all one needs to prove is that $\|x\|$ satisfies the triangle inequality,
property (N2). We do this as follows:
\[ 0 \le \|x + y\|^2 = (x + y, x + y) = \|x\|^2 + 2\,\mathrm{Re}\,(x,y) + \|y\|^2 \le \|x\|^2 + 2|(x,y)| + \|y\|^2, \]
and using the Schwarz Inequality we get
\[ \|x + y\|^2 \le \|x\|^2 + 2\|x\|\,\|y\| + \|y\|^2 = (\|x\| + \|y\|)^2. \; ∎ \]




Note that $\|x + y\| = \|x\| + \|y\|$ or $\|x + y\| = |\,\|x\| - \|y\|\,|$ if and only if $x$ and
$y$ are collinear.

Since every inner product space has a norm, it is a metric space. We therefore 
adopt the following convention: 


Whenever one discusses the topological properties of an inner product space, this is in 
reference to the metric defined by 


\[ d(x,y) = \{(x - y, x - y)\}^{1/2}. \]


Given this convention, the first thing to note is that the inner product is a 
continuous mapping of the product space (X,d) x (X,d) into the scalar field. We 
can also now ask if a given inner product space is complete. Some are and some are 
not, so we make the following definition. 


5.12.6 DEFINITION. A Hilbert space is a complete inner product space. 


At this point the reader is probably willing to grant that, all else being equal, 
completeness is a desirable property for an inner product space to have. Actually, 
the case is much stronger. We shall see that Hilbert spaces have significantly 
"better" geometric structure than inner product spaces do in general. So carefully
note in the rest of this chapter which theorems are true for inner product spaces and 
which require completeness. 

An important fact which we note in passing is that every inner product space 
has a completion. The concept of the completion of a normed linear space was 
defined in Section 9. The same concept is applicable in the case of inner product 
spaces. We shall discuss this further in the exercises in Sections 19 and 21. 

One fact of Euclidean geometry is the Parallelogram Law, which we illustrate 
in Figure 5.12.2. The next theorem shows that the parallelogram law holds for inner 
product spaces in general. 


5.12.7 THEOREM. (PARALLELOGRAM LAW.) If $X$ is an inner product space, then
\[ \|x + y\|^2 + \|x - y\|^2 = 2\|x\|^2 + 2\|y\|^2 \tag{5.12.3} \]
for all $x$ and $y$ in $X$.


[Figure 5.12.2. Parallelogram Law: $D_1^2 + D_2^2 = 2S_1^2 + 2S_2^2$.]




The proof of this theorem is a simple substitution which we leave to the reader. 

It is also a fact that the converse of Theorem 5.12.7 is true. That is, if X is a 
normed linear space and its norm satisfies (5.12.3), then there is a unique inner 
product defined on $X$ that generates the norm. In other words, $X$ is an inner product
space "in disguise" if and only if its norm satisfies (5.12.3).


5.12.8 THEOREM. Let $(X,\|\cdot\|)$ be a complex normed linear space such that
\[ \|x + y\|^2 + \|x - y\|^2 = 2\|x\|^2 + 2\|y\|^2 \]
for all $x$ and $y$ in $X$. Then
\[ (x,y) = \tfrac{1}{4}\{\|x + y\|^2 - \|x - y\|^2 + i\|x + iy\|^2 - i\|x - iy\|^2\} \tag{5.12.4} \]
defines an inner product on $X$ and $\|x\| = (x,x)^{1/2}$ for all $x$ in $X$. Moreover, the inner
product given by (5.12.4) is the only inner product on $X$ that generates the norm.


The proof of this theorem is left as a somewhat tedious exercise. 
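
Although the proof is left as an exercise, formula (5.12.4) and the role of the parallelogram law are easy to test numerically. The sketch below, written for $C^n$, recovers the standard inner product from the Euclidean norm and shows that the $l_1$ norm violates (5.12.3). The function names and sample vectors are assumptions chosen for illustration.

```python
import math

def inner(x, y):
    """Standard inner product on C^n: linear in x, conjugate-linear in y."""
    return sum(a * b.conjugate() for a, b in zip(x, y))

def norm2(x):
    """Euclidean norm on C^n, the norm generated by `inner`."""
    return math.sqrt(sum(abs(a) ** 2 for a in x))

def norm1(x):
    """The l1 norm on C^n."""
    return sum(abs(a) for a in x)

def parallelogram_defect(norm, x, y):
    """||x+y||^2 + ||x-y||^2 - 2||x||^2 - 2||y||^2; zero when (5.12.3) holds."""
    plus = [a + b for a, b in zip(x, y)]
    minus = [a - b for a, b in zip(x, y)]
    return norm(plus) ** 2 + norm(minus) ** 2 - 2 * norm(x) ** 2 - 2 * norm(y) ** 2

def polarization(norm, x, y):
    """Right-hand side of (5.12.4) evaluated with the given norm."""
    def n2(sign, factor):
        return norm([a + sign * factor * b for a, b in zip(x, y)]) ** 2
    return 0.25 * (n2(1, 1) - n2(-1, 1) + 1j * n2(1, 1j) - 1j * n2(-1, 1j))

x = [1 + 2j, 3 - 1j, 0.5j]
y = [2 - 1j, 1j, 1 + 1j]
print(parallelogram_defect(norm2, x, y))         # zero up to rounding
print(polarization(norm2, x, y), inner(x, y))    # the two values agree
print(parallelogram_defect(norm1, [1, 0], [0, 1]))  # nonzero: (5.12.3) fails for l1
```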


EXERCISES 


1. Show that equality holds in the Schwarz Inequality (5.12.2) if and only if the
set {x,y} is linearly dependent.
2. Show that the Schwarz Inequality holds even when (IP4) is replaced by:
“$(x,x) \ge 0$ for all $x \in X$.” When does equality hold in this case?
3. Prove the Parallelogram Law, Theorem 5.12.7.
4. Prove Theorem 5.12.8. Modify Theorem 5.12.8 for the case of a real normed
linear space.
5. (a) Use Theorems 5.12.7 and 5.12.8 to show that the complex space $C^2$ with
the norm $\|\cdot\|_p$, $1 \le p \le \infty$ and $p \ne 2$, is not an inner product space.
(b) Show that the “inequality” in (5.12.3) can go either way.
6. (a) Describe all possible inner products on the complex linear space C.
(b) Show that the real linear space $R^2$ can have a (real) inner product that is
not included under part (a).
7. Let (x,y) be an inner product on a linear space X.
(a) Show that for x fixed the mapping $y \to (x,y)$ is a continuous mapping of
X into C.
(b) Show that for y fixed $x \to (x,y)$ is continuous in x.
(c) Discuss the relationships

\[ \Bigl(\sum_n x_n,\; y\Bigr) = \sum_n (x_n, y), \qquad \Bigl(\int x,\; y\Bigr) = \int (x, y). \]

[Hint: Use the proof of Theorem 5.6.2.]


8. Let X be an inner product space. A mapping $q(x,y)\colon X \times X \to C$ is said to be a
sesquilinear functional if

(i) $q(x_1 + x_2, y) = q(x_1,y) + q(x_2,y)$,
(ii) $q(\alpha x, y) = \alpha q(x,y)$,
(iii) $q(x, y_1 + y_2) = q(x,y_1) + q(x,y_2)$,
(iv) $q(x, \alpha y) = \bar{\alpha} q(x,y)$,

for all $x, x_1, x_2, y, y_1, y_2$ in X and $\alpha$ in C. A mapping $Q\colon X \to C$ is said to
be the quadratic form generated by q if $Q(x) = q(x,x)$. A sesquilinear functional
is said to be symmetric if $q(x,y) = \overline{q(y,x)}$. The functional q is positive if $q(x,x) \ge 0$
for all x. A sesquilinear functional is bounded if there is a real number k such
that $|q(x,y)| \le k \|x\| \cdot \|y\|$ for all x and y. The quadratic form Q is bounded
if $|Q(x)| \le K\|x\|^2$ for all x. Let

\[ \|q\| = \inf\{k : |q(x,y)| \le k\|x\| \cdot \|y\| \text{ for all } x, y\}, \]
\[ \|Q\| = \inf\{K : |Q(x)| \le K\|x\|^2 \text{ for all } x\}. \]


(a) Show that

\[ q(x,y) = Q\bigl(\tfrac{1}{2}(x + y)\bigr) - Q\bigl(\tfrac{1}{2}(x - y)\bigr) + iQ\bigl(\tfrac{1}{2}(x + iy)\bigr) - iQ\bigl(\tfrac{1}{2}(x - iy)\bigr). \]

(b) Show that q is symmetric if and only if Q is real-valued.
(c) Show that q is bounded if and only if Q is bounded and that

\[ \|Q\| \le \|q\| \le 2\|Q\|. \]

(d) (Schwarz Inequality.) Let q be positive, then show that

\[ |q(x,y)|^2 \le Q(x)Q(y). \]

(We return to this exercise again in Exercises 2 and 8 of Section 23.)


9. Let x, y be in an inner product space X and assume that $\|\lambda x + (1 - \lambda)y\| = \|x\|$
for all $\lambda$, $0 \le \lambda \le 1$.

(a) Show that x = y.

(b) What happens to part (a) if X has a norm but no inner product? (This shows
that spheres in inner product spaces do not have “flat edges.”)

10. Let $\{x_1, x_2\}$ be a linearly independent set in an inner product space X. Define
$f\colon C \to R$ by $f(\alpha) = \|x_1 - \alpha x_2\|$.
(a) Where does f take on its minimum value?
(b) Give a geometric interpretation of this.

11. Let $\{x_1, x_2, x_3\}$ be a linearly independent set in an inner product space X
and assume that $\{x_1, x_2\}$ satisfy $(x_i, x_j) = \delta_{ij}$, $1 \le i, j \le 2$. Define $f\colon C^2 \to R$ by
$f(\alpha_1, \alpha_2) = \|\alpha_1 x_1 + \alpha_2 x_2 - x_3\|$. Now show that f attains its minimum when
$\alpha_i = (x_3, x_i)$, $i = 1, 2$. (Compare this with the Gram-Schmidt process in Section
17.)


12. Let $\{x_1, x_2, \ldots, x_n\}$ be vectors from an inner product space X and define

\[ G(x_1, x_2, \ldots, x_n) = \det \begin{pmatrix} (x_1,x_1) & (x_1,x_2) & \cdots & (x_1,x_n) \\ (x_2,x_1) & (x_2,x_2) & \cdots & (x_2,x_n) \\ \vdots & \vdots & & \vdots \\ (x_n,x_1) & (x_n,x_2) & \cdots & (x_n,x_n) \end{pmatrix}. \]

Show that $\{x_1, \ldots, x_n\}$ is linearly independent if and only if $G(x_1, \ldots, x_n) \ne 0$.

13. Suppose we consider the Banach space Blt[H,H] made up of all bounded linear
transformations of a Hilbert space into itself, where, as before, $\|T\|$ is defined
by $\|T\| = \inf\{M : \|Tx\| \le M\|x\| \text{ for all } x \in H\}$. Does it follow that

\[ \|T + S\|^2 + \|T - S\|^2 = 2\|T\|^2 + 2\|S\|^2 \]

for all T, S in Blt[H,H]? (The point of the exercise is to show that Blt[H,H]
with the usual norm is not necessarily a Hilbert space.)

14. Show that in a normed linear space X the following inequality holds for all
$x, y \in X$:

\[ 2\|x\|^2 - 4\|x\|\,\|y\| + 2\|y\|^2 \le \|x + y\|^2 + \|x - y\|^2 \le 2\|x\|^2 + 4\|x\|\,\|y\| + 2\|y\|^2. \]

Compare this result with the Parallelogram Law.

15. A normed linear space X is said to be uniformly convex if whenever $\{x_n\}$ and
$\{y_n\}$ are sequences in X with $\|x_n\| = \|y_n\| = 1$ and $\|x_n + y_n\| \to 2$, one also has
$\|x_n - y_n\| \to 0$. Show that every inner product space is uniformly convex.

16. Let $A\colon X \to X$, $B\colon X \to X$ be linear operators on an inner product space X.
Assume that (x,Ay) = (x,By) for all x and y. Show that A = B. Is linearity
important here?

17. Let y be a fixed point in a Hilbert space H. Describe the operator $L\colon H \to H$ that
satisfies Lx = (x,y)y.


13. EXAMPLES


EXAMPLE 1. Let $X = C^n$ and define

\[ (x,y) = \sum_{i=1}^{n} x_i \overline{y_i}. \]

This is the usual inner product on $C^n$. ∎

EXAMPLE 2. Let X = C[0,1] be the space of continuous complex-valued
functions on the interval [0,1] and define

\[ (x,y) = \int_0^1 x(t)\overline{y(t)}\, dt. \]

This is an inner product space where the norm is

\[ \|x\| = \Bigl(\int_0^1 |x(t)|^2\, dt\Bigr)^{1/2}. \]

As shown in Appendix D, X is not complete. ∎


EXAMPLE 3. The last example was not a Hilbert space, but if we let $X = L_2(I)$
be the space of complex-valued measurable functions x with $\int_I |x|^2\, dt < \infty$, then
X is a Hilbert space when the inner product is given by

\[ (x,y) = \int_I x\bar{y}\, dt. \]

The integral here is the Lebesgue integral. (See Appendix D.) ∎

EXAMPLE 4. Let p(t) be a real-valued, continuous function that satisfies
$0 < p(t) \le B$ on I. Let X be the complex space $L_2(I)$ and define

\[ (x,y) = \int_I x\bar{y}p\, dt. \tag{5.13.1} \]

It follows that the product $x\bar{y}$ is in $L_1(I)$. Since $p \in L_\infty(I)$, the integrand in (5.13.1)
is defined for all x and y in $L_2(I)$. We leave it as an exercise to show that (5.13.1)
defines an inner product on X and that X is a Hilbert space.

The function p in (5.13.1) is a weighting function. It is not necessary to assume
that p is continuous, in fact any bounded, measurable, real-valued, positive function
in $L_\infty(I)$ would suffice. ∎


EXAMPLE 5. Let D be a compact region in the $t_1t_2$-plane, and let X = C(D)
be the space of complex-valued, continuous functions defined on D. Let

\[ (x,y) = \iint_D x\bar{y}\, dt_1\, dt_2. \]

Then X is an inner product space, but X is not complete. ∎

EXAMPLE 6. The last example has an analog in higher dimensions. Let D
be a compact region in $R^n$ and let X = C(D) be the space of complex-valued,
continuous functions defined on D. Let

\[ (x,y) = \int_D x\bar{y}\, dt. \]

This space is not complete, but the completion would be $L_2(D)$. (See Appendix D.) ∎


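EXAMPLE 7. Let D be a compact set in $R^3$ and let $X = C^2(D)$ be the space of
complex-valued functions that have continuous second partial derivatives in D. If
$u \in X$ let

\[ \nabla u = \Bigl(\frac{\partial u}{\partial x_1}, \frac{\partial u}{\partial x_2}, \frac{\partial u}{\partial x_3}\Bigr). \]

Define

\[ (u,v) = \int_D \Bigl[u\bar{v} + \frac{\partial u}{\partial x_1}\overline{\frac{\partial v}{\partial x_1}} + \frac{\partial u}{\partial x_2}\overline{\frac{\partial v}{\partial x_2}} + \frac{\partial u}{\partial x_3}\overline{\frac{\partial v}{\partial x_3}}\Bigr]\, dx, \tag{5.13.2} \]

where $x = (x_1, x_2, x_3)$. This is clearly linear in u, also $(u,v) = \overline{(v,u)}$ and $(u,u) \ge 0$.
Furthermore, if (u,u) = 0, then $\int_D |u|^2\, dx = 0$. Since u is continuous, this implies
that u = 0. Hence (5.13.2) is an inner product on X. The norm is given by

\[ \|u\| = \Bigl(\int_D \bigl(|u|^2 + |\nabla u|^2\bigr)\, dx\Bigr)^{1/2}, \]

where

\[ |\nabla u|^2 = \Bigl|\frac{\partial u}{\partial x_1}\Bigr|^2 + \Bigl|\frac{\partial u}{\partial x_2}\Bigr|^2 + \Bigl|\frac{\partial u}{\partial x_3}\Bigr|^2. \] ∎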


EXAMPLE 8. Let $X = C^n$ with (x,y) defined by Example 1. Let I be a compact
interval [a,b] and let U = C(I,X) be the space of continuous functions u defined on
I with values u(t) in X. If $u = \{u_1, \ldots, u_n\}$ and $v = \{v_1, \ldots, v_n\}$ define

\[ (u,v) = \int_I (u(t), v(t))\, dt = \int_I \bigl[u_1\overline{v_1} + \cdots + u_n\overline{v_n}\bigr]\, dt. \tag{5.13.3} \]

Then (u,v) is an inner product on U. The norm of a function $u = \{u_1, \ldots, u_n\}$ is
given by

\[ \|u\| = \Bigl(\int_I \bigl[|u_1|^2 + \cdots + |u_n|^2\bigr]\, dt\Bigr)^{1/2}. \]

One can replace I with a compact region D in $R^m$ and let U = C(D,X). The
integral in (5.13.3) then becomes a volume integral over D. ∎


EXAMPLE 9. Let $l_2 = l_2[0,\infty)$ be the space of sequences $x = (x_1, x_2, \ldots)$ of
complex numbers such that $\sum_{i=1}^{\infty} |x_i|^2 < \infty$. Define

\[ (x,y) = \sum_{i=1}^{\infty} x_i \overline{y_i}. \]

Then this is an inner product on $l_2$, and $l_2$ is a Hilbert space. This is referred to as
the usual inner product on $l_2$. ∎

EXAMPLE 10. Let us consider the space $l_2(-\infty,\infty)$ of all bi-sequences
$x = (\ldots, x_{-2}, x_{-1}, x_0, x_1, x_2, \ldots)$ of complex numbers such that $\sum_{i=-\infty}^{\infty} |x_i|^2 < \infty$.
Then

\[ (x,y) = \sum_{i=-\infty}^{\infty} x_i \overline{y_i} \]

is an inner product on $l_2(-\infty,\infty)$, and $l_2(-\infty,\infty)$ is a Hilbert space. ∎


EXAMPLE 11. Consider a probability space $(\Omega, \mathcal{F}, P)$ and let X denote the
collection of all random variables $x(\omega)$ with finite variance $\sigma^2(x)$. That is, if E
denotes the expectation, then $\alpha = E(x)$ is finite and

\[ \sigma^2(x) = E(|x - \alpha|^2) \]

is finite. An inner product is given on X by

\[ (x,y) = E(x\bar{y}) \]

and the induced norm is given by $\|x\| = E(|x|^2)^{1/2}$. The space X is then the Hilbert
space $L_2(\Omega, \mathcal{F}, P)$. (See Exercise 1 and Appendix E.) ∎


EXAMPLE 12. (SOBOLEV SPACES.) This is a continuation of Example 12,
Section 3. Let $\Omega$ be an open set in $R^m$ and let $u \in C^n(\Omega)$. Define a norm on u by

\[ \|u\|_{n,2} = \Bigl[\int_\Omega \sum_{|\alpha| \le n} |D^\alpha u(x)|^2\, dx\Bigr]^{1/2}. \tag{5.13.4} \]

This number may be $+\infty$. Let $\tilde{C}^n(\Omega)$ denote those functions u in $C^n(\Omega)$ for which
$\|u\|_{n,2}$ is finite. It is easy to see that $\tilde{C}^n(\Omega)$ is a normed linear space, and in fact the
norm on $\tilde{C}^n(\Omega)$ is generated by the inner product

\[ (u,v)_n = \int_\Omega \sum_{|\alpha| \le n} D^\alpha u(x)\overline{D^\alpha v(x)}\, dx. \]

The completion of $\tilde{C}^n(\Omega)$ with respect to this norm is the Sobolev space $H^n(\Omega)$.

Another Sobolev space we shall be interested in is $H_0^n(\Omega)$. This is defined as the
completion of $C_0^\infty(\Omega)$ (the space of $C^\infty$-functions with compact support in $\Omega$) under
the norm (5.13.4).

Both of the spaces $H^n(\Omega)$ and $H_0^n(\Omega)$ are very useful in the study of partial
differential operators. We shall see some applications in Sections 7.6-7.8. For other
applications we refer the reader to Agmon [1] and Friedman [2].

It should be noted that in general, $H^n(\Omega)$ and $H_0^n(\Omega)$ are different spaces.
(See Exercise 4 below.) ∎


EXERCISES 


1. In Example 11, show that $x(\omega)$ has finite variance if and only if $x \in L_2(\Omega)$.

2. In Example 11, let H denote the collection of all random variables x in $L_2(\Omega)$
with the property that E(x) = 0. Show that H is a closed linear subspace of
$L_2(\Omega)$.

3. The following exercises refer to Example 12.

(a) Let $\{u_k\}$ be a Cauchy sequence in $\tilde{C}^n(\Omega)$. Show that for any $\alpha$, $|\alpha| \le n$,
$\{D^\alpha u_k\}$ is a Cauchy sequence in $L_2(\Omega)$.

(b) Show that every element in $H^n(\Omega)$ can be viewed as a function u in $L_2(\Omega)$
with strong $L_2$ derivatives of order up to n, that is, there is a sequence $\{u_k\}$ in
$\tilde{C}^n(\Omega)$ such that $\{D^\alpha u_k\}$ is a Cauchy sequence in $L_2(\Omega)$ for $|\alpha| \le n$, and
$u_k \to u$ in $L_2(\Omega)$.

(c) A function u in $L_2(\Omega)$ is said to have a weak derivative $u^\alpha$ in $L_2(\Omega)$ if

\[ \int_\Omega \phi(x)u^\alpha(x)\, dx = (-1)^{|\alpha|} \int_\Omega u(x)D^\alpha\phi(x)\, dx \]

for all $\phi$ in $C_0^\infty(\Omega)$. A function u has weak derivatives of order up to n if it has
a weak derivative $u^\alpha$ for every $\alpha$, $|\alpha| \le n$. Show that if a function u in $L_2(\Omega)$
has strong derivatives of order up to n, then it has weak derivatives of order
up to n.

(d) Let $W^n(\Omega)$ denote the class of functions in $L_2(\Omega)$ with weak derivatives of
order up to n. For u, v in $W^n(\Omega)$ let

\[ (u,v)_n = \int_\Omega \sum_{|\alpha| \le n} D^\alpha u\, \overline{D^\alpha v}\, dx. \]

Show that $W^n(\Omega)$ is a Hilbert space with the above inner product.
(e) Show that $H^n(\Omega) \subset W^n(\Omega)$. (One can show that $H^n = W^n$, see Meyers and
Serrin [1] and Friedman [2].)

4. Consider the Sobolev spaces $H^n(\Omega)$ and $H_0^n(\Omega)$. Show that $H^0(\Omega) = H_0^0(\Omega) =
L_2(\Omega)$. (If $n \ge 1$, then $H^n$ is generally different from $H_0^n$, see Friedman [2].)


5. Let p satisfy $1 \le p < \infty$ and let n be an integer. Show that

\[ \|u\|_{n,p} = \Bigl[\int_\Omega \sum_{|\alpha| \le n} |D^\alpha u(x)|^p\, dx\Bigr]^{1/p} \]

is a norm on an appropriate space. (The completion of these spaces defines the
Sobolev spaces $H^{n,p}(\Omega)$ and $H_0^{n,p}(\Omega)$, see Friedman [2].)

6. Let $D^\alpha$ denote a partial differential operator with $|\alpha| \le n$, see Exercise 12,
Section 5.3 for notation.
(a) Show that $D^\alpha\colon C_0^\infty(\Omega) \to L_2(\Omega)$ is a bounded linear mapping where $L_2(\Omega)$
has the usual norm and $C_0^\infty(\Omega)$ has the norm $\|\cdot\|_{n,2}$.
(b) Show that $D^\alpha$ has a unique extension to $H_0^n(\Omega)$.
(c) Show that $D^\alpha$ can be similarly defined on $H^n(\Omega)$.


14. ORTHOGONALITY 


As has already been mentioned, the geometry of Hilbert spaces is, roughly
speaking, just a generalization of Euclidean geometry. The main reason for this
simplicity is that we have an inner product which allows us to introduce the concept
of orthogonality.


In order to motivate the definition let us refer again to Figure 5.12.1. The dot
product of the two vectors in this figure is given by $x \cdot y = l_1 l_2 \cos\theta$. If $l_1 \ne 0$,
$l_2 \ne 0$, then $x \cdot y = 0$ if and only if $\cos\theta = 0$. That is, $x \cdot y = 0$ if and only if x and y
are perpendicular, or orthogonal, to one another. Based on this observation we make
the following definition.


5.14.1 DEFINITION. Two vectors x and y in an inner product space are said to
be orthogonal if (x,y) = 0. If x and y are orthogonal, this is denoted by $x \perp y$.

We shall say that two subsets A and B of X are orthogonal if $x \perp y$ for all x in
A and y in B. This will be denoted by $A \perp B$.

An immediate consequence of Definition 5.14.1 is that the familiar Pythagorean
Theorem is true in any inner product space.


5.14.2 THEOREM. (PYTHAGOREAN THEOREM.) If $x \perp y$ in an inner product space
X, then

\[ \|x + y\|^2 = \|x\|^2 + \|y\|^2. \]

(See Figure 5.14.1.)

Figure 5.14.1.

Proof: Assume that $x \perp y$, that is, (x,y) = 0. Then

\[ \|x + y\|^2 = (x + y, x + y) = \|x\|^2 + (x,y) + (y,x) + \|y\|^2 = \|x\|^2 + \|y\|^2. \] ∎


EXAMPLE 1. Suppose we make N measurements (N a positive integer) of a
quantity $S = S_0 + s$, where $S_0$ is a known constant, and s is random. Assume
further that each measurement is corrupted with additive noise, so that each
measurement is of the form

\[ m_i = S + n_i, \qquad i = 1, 2, \ldots, N. \]

We can place this situation in a Hilbert space framework by treating $s, n_1, n_2, \ldots,
n_N, m_1, m_2, \ldots, m_N$ as vectors in the Hilbert space H made up of all complex-valued
random variables x (defined on some underlying probability space) such
that

\[ E\{|x|^2\} < \infty \quad \text{and} \quad E\{x\} = 0, \]

where E denotes the expectation operation. The inner product on this space is
defined by

\[ (x,y) = E\{x\bar{y}\}, \]

where $\bar{y}$ denotes the complex conjugate of y. In this case, x and y being orthogonal
corresponds to x and y being uncorrelated random variables.

We assume here that the random variables $s, n_1, n_2, \ldots, n_N$ are pairwise
uncorrelated, that is, orthogonal. Further, we assume that the variances satisfy
$(n_1,n_1) = (n_2,n_2) = \cdots = (n_N,n_N) = \sigma^2$ and $(S,S) = |S_0|^2 + \sigma_s^2$, where $\sigma_s^2 =
(s,s)$.

Then suppose that we want to form an estimate $\hat{S}$ of S by “averaging” the N
measurements

\[ \begin{aligned} \hat{S} = S_0 + \hat{s} &= \frac{1}{N}\{m_1 + m_2 + \cdots + m_N\} \\ &= \frac{1}{N}\{S + S + \cdots + S + n_1 + n_2 + \cdots + n_N\} \\ &= S_0 + \frac{1}{N}\{s + s + \cdots + s + n_1 + n_2 + \cdots + n_N\}. \end{aligned} \]

Here the vector $\{s + s + \cdots + s + n_1 + n_2 + \cdots + n_N\}$ has an interesting
geometrical interpretation. In particular, $n_1 + n_2 + \cdots + n_N$ is the sum of N pairwise
orthogonal vectors of length $\sigma$. If $\sigma_N$ denotes the length of this sum, it follows from
the Pythagorean Theorem that $\sigma_N^2 = N\sigma^2$. On the other hand, $s + s + \cdots + s$ is
the sum of s with itself N times, and $\|s + \cdots + s\|^2 = N^2\sigma_s^2$. It follows that the
“signal-to-noise ratio” in $\hat{S}$ is given by

\[ \frac{\|S + S + \cdots + S\|^2}{\|n_1 + \cdots + n_N\|^2} = \frac{N^2\{|S_0|^2 + \sigma_s^2\}}{N\sigma^2} = \frac{N\{|S_0|^2 + \sigma_s^2\}}{\sigma^2}. \]

In other words, the more measurements (that is, larger N) the more “signal”
relative to “noise.” If we write

\[ \hat{S} = S + D, \]

where D denotes the error, then D is a random variable, $D = (1/N)\{n_1 + \cdots + n_N\}$
and

\[ \|D\| = \frac{\sigma}{\sqrt{N}}. \] ∎
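The claim that the error norm falls like $\sigma/\sqrt{N}$ can be checked by simulation. The sketch below is an illustration only, under assumptions that are not part of the example: the noise terms are taken to be real, independent and Gaussian (independence implies the pairwise orthogonality used above), and the function names `mean_square_error` and `snr` are ours.

```python
import random

def mean_square_error(N, sigma, trials, seed=0):
    """Monte Carlo estimate of ||D||^2 = E{|D|^2}, D = (n_1 + ... + n_N)/N.

    Assumption: the n_i are independent real Gaussians with mean 0 and
    standard deviation sigma, hence pairwise orthogonal in H.
    """
    rng = random.Random(seed)
    total = 0.0
    for _ in range(trials):
        D = sum(rng.gauss(0.0, sigma) for _ in range(N)) / N
        total += D * D
    return total / trials

def snr(N, S0, sigma_s, sigma):
    """Signal-to-noise ratio N(|S0|^2 + sigma_s^2)/sigma^2 from the example."""
    return N * (abs(S0) ** 2 + sigma_s ** 2) / sigma ** 2

# With N = 16 and sigma = 2 the Pythagorean Theorem predicts
# ||D||^2 = sigma^2 / N = 0.25.
estimate = mean_square_error(16, 2.0, 20000)
```

The estimate should be close to 0.25, and `snr` grows linearly in N, which is the content of the remark that more measurements give more signal relative to noise.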

One of the most striking differences between Hilbert spaces and Banach spaces
arises in approximation theory. Consider the following three part problem: Let
M be a proper closed linear subspace in a Banach space B and let $x_0 \in B$ with
$x_0 \notin M$. (See Figure 5.14.2.) We first ask, does there exist a point $y_0 \in M$ that is
closest to $x_0$, that is, $\|x_0 - y_0\| \le \|x_0 - y\|$ for all $y \in M$? Second, is $y_0$ unique?
Third, if a $y_0$ does exist how do we find it?

Figure 5.14.2.

Let

\[ \delta = \inf\{\|x_0 - y\| : y \in M\}. \tag{5.14.1} \]

Since M is closed in B, it follows (Exercise 18, Section 3.12) that $\delta > 0$. The following
result is a direct consequence of the definition of the infimum function and
is closely related to the Riesz Theorem (Theorem 5.5.4).


5.14.3 THEOREM. Let B be a Banach space and let M be a closed linear subspace
of B. Let $x_0 \in B$ and define $\delta$ by (5.14.1). Then for each $\eta > 0$ there is a $y \in M$ such
that

\[ \delta \le \|x_0 - y\| < \delta + \eta. \]

In other words, there exist approximations to $x_0$ in M such that $\|x_0 - y\|$ is
arbitrarily close to $\delta$. Again, we see this follows from the definition of $\delta$, that is,
$\delta = \inf\{\|x_0 - y\| : y \in M\}$. However, Theorem 5.14.3 does not say we can actually
achieve $\delta$. Since B is complete and M is closed, one might suspect that this would be
no problem. Alas, things are not this simple in Banach spaces. The next example
illustrates this point.


EXAMPLE 2. This is a continuation of Example 3, Section 5. Choose $x_0$ in X
but not in M. Let $\alpha = \int_0^1 x_0(t)\, dt$, then $\alpha \ne 0$ since $x_0 \notin M$. If we can find a $y_0$ in
M such that $\|x_0 - y_0\|_\infty \le \|x_0 - y\|_\infty$ for all y in M, then

\[ z_0 = \frac{x_0 - y_0}{\|x_0 - y_0\|_\infty} \]

has the property that $\|z_0\|_\infty = 1$, $z_0 \notin M$ and for any y in M one has

\[ \|z_0 - y\|_\infty = \Bigl\|\frac{x_0 - y_0}{\|x_0 - y_0\|_\infty} - y\Bigr\|_\infty = \frac{\|x_0 - (y_0 + y')\|_\infty}{\|x_0 - y_0\|_\infty}, \]

where $y' = \|x_0 - y_0\|_\infty\, y$. Since $y_0 + y'$ belongs to M, one then has

\[ \|z_0 - y\|_\infty \ge \frac{\|x_0 - y_0\|_\infty}{\|x_0 - y_0\|_\infty} = 1. \]

But it was shown in Example 3, Section 5 that this is impossible. ∎


The pathology of Example 2 cannot occur in a Hilbert space. 


5.14.4 THEOREM. Let H be a Hilbert space and let M be a closed linear subspace
of H. Let $x_0 \in H$ and define $\delta$ by (5.14.1). Then there is one (and only one) $y_0 \in M$ such
that

\[ \|x_0 - y_0\| = \delta. \]

Moreover, $x_0 - y_0 \perp M$, that is, $(x_0 - y_0, y) = 0$ for all $y \in M$. Furthermore, $y_0$ is
the only point in M such that $x_0 - y_0 \perp M$.

Again we are faced with a familiar geometric idea. This theorem says (see
Figure 5.14.3) that the unique point in M closest to $x_0$ is found by “dropping a
perpendicular from $x_0$ to M.” And, it would not be completely incorrect to say that
this simple idea is the most widely applied piece of Hilbert space geometry. Moreover,
it is important to note that Theorem 5.14.4 is not true for inner product spaces
that are not complete.

Figure 5.14.3.


Proof: First let us show that $y_0$ exists. Let $\{y_n\}$ be a sequence in M such that

\[ \delta \le \|x_0 - y_n\| < \delta + \frac{1}{n}, \tag{5.14.2} \]

where $\delta = \inf\{\|x_0 - y\| : y \in M\}$. The first thing we do is show that $\{y_n\}$ is a Cauchy
sequence. The geometric situation is sketched in Figure 5.14.4. The vectors “below”
the subspace M are shown in order to call attention to the parallelogram with
“sides” $(x_0 - y_m)$, $(x_0 - y_n)$ and “diagonals” $(y_n - y_m)$, $(2x_0 - y_m - y_n)$. We know
that $\lim_{m\to\infty} \|x_0 - y_m\| = \lim_{n\to\infty} \|x_0 - y_n\| = \delta$, so the length of the sides is
approaching $\delta$. On the other hand, M is a convex set; therefore, $(y_n + y_m)/2$ is a
point in M. This means that $\|x_0 - (y_n + y_m)/2\| \ge \delta$ or $\|2x_0 - y_n - y_m\| \ge 2\delta$. But
then using the Parallelogram Law, we have

\[ \|y_n - y_m\|^2 + \|2x_0 - y_n - y_m\|^2 = 2\|x_0 - y_n\|^2 + 2\|x_0 - y_m\|^2, \]

and by (5.14.2) we get

\[ \|y_n - y_m\|^2 \le 2\|x_0 - y_n\|^2 + 2\|x_0 - y_m\|^2 - 4\delta^2 < \frac{2}{n^2} + \frac{2}{m^2} + 4\delta\Bigl(\frac{1}{n} + \frac{1}{m}\Bigr) \to 0, \]

as $m, n \to \infty$. Thus $\{y_n\}$ is a Cauchy sequence in M. Since M is a closed set in a
complete space, M itself is complete, so $\{y_n\}$ converges to a point $y_0 \in M$. Moreover,

\[ \|x_0 - y_0\| = \lim_{n\to\infty} \|x_0 - y_n\| = \delta, \]

because the norm is a continuous function.

Figure 5.14.4.
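Next let us show that $y_0$ is unique. Suppose $y_0$ and $y_0'$ are distinct points in M
such that

\[ \|x_0 - y_0\| = \|x_0 - y_0'\| = \delta. \]

Then $(y_0 + y_0')/2$ is in M. Again using the Parallelogram Law, we have

\[ \Bigl\|x_0 - \frac{y_0 + y_0'}{2}\Bigr\|^2 = 2\Bigl\|\frac{x_0 - y_0}{2}\Bigr\|^2 + 2\Bigl\|\frac{x_0 - y_0'}{2}\Bigr\|^2 - \Bigl\|\frac{y_0 - y_0'}{2}\Bigr\|^2 < \frac{\|x_0 - y_0\|^2}{2} + \frac{\|x_0 - y_0'\|^2}{2} = \delta^2, \]

which is a contradiction.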


Finally, let us show that $y_0$ is the only point in M such that $x_0 - y_0 \perp M$. Let
y be any point in M. Then $(y_0 + \alpha y) \in M$ and again from the definition of $\delta$,

\[ \delta^2 \le (x_0 - y_0 - \alpha y,\; x_0 - y_0 - \alpha y) = \|x_0 - y_0\|^2 - 2\,\mathrm{Re}\{\alpha(y, x_0 - y_0)\} + |\alpha|^2(y,y), \]

where $\alpha$ is a scalar. Since $\|x_0 - y_0\| = \delta$, we have

\[ 0 \le -2\,\mathrm{Re}\{\alpha(y, x_0 - y_0)\} + |\alpha|^2(y,y). \]

Then letting $\alpha = \beta(x_0 - y_0, y)$, where $\beta$ is real, we have

\[ 0 \le -2\beta|(x_0 - y_0, y)|^2 + \beta^2|(x_0 - y_0, y)|^2(y,y), \]

which holds for all $\beta$. But this implies that the coefficient of the linear term in $\beta$ is
zero, that is, $(x_0 - y_0, y) = 0$. Hence $x_0 - y_0 \perp M$. Now suppose that $x_0 - y_0' \perp M$,
where $y_0' \in M$. Then $(x_0 - y_0, y) = (x_0 - y_0', y)$ for all $y \in M$. So $(y_0 - y_0', y) = 0$
for $y \in M$. Hence, $y_0 - y_0' \perp M$. But $(y_0 - y_0') \in M$. So $y_0 - y_0' = 0$. This shows
that $y_0$ is the only point in M such that $x_0 - y_0 \perp M$. ∎
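In a finite-dimensional space the conclusions of Theorem 5.14.4 can be verified directly. The sketch below makes two assumptions that go beyond the theorem: the space is $R^3$, and M is given by an orthonormal basis, so that the closest point can be written as a sum of the components of $x_0$ along that basis. The names `project`, `onb` and the sample vectors are ours.

```python
import math

def inner(x, y):
    """Usual inner product: sum of x_i * conj(y_i)."""
    return sum(a * b.conjugate() for a, b in zip(x, y))

def norm(x):
    return math.sqrt(inner(x, x).real)

def project(x0, onb):
    """Closest point to x0 in span(onb); onb is assumed orthonormal."""
    y0 = [0.0] * len(x0)
    for e in onb:
        c = inner(x0, e)  # component of x0 along e
        y0 = [yi + c * ei for yi, ei in zip(y0, e)]
    return y0

# M = span{(1,1,0)/sqrt(2), (0,0,1)} in R^3 (an illustrative choice).
r = 1 / math.sqrt(2)
onb = [[r, r, 0.0], [0.0, 0.0, 1.0]]
x0 = [3.0, 1.0, 2.0]

y0 = project(x0, onb)                      # approximately (2, 2, 2)
residual = [a - b for a, b in zip(x0, y0)]  # approximately (1, -1, 0)
delta = norm(residual)                      # approximately sqrt(2)

# Any other point of M is farther from x0 than y0 is.
other = [yi + 0.3 * a - 0.7 * b for yi, a, b in zip(y0, onb[0], onb[1])]
other_distance = norm([a - b for a, b in zip(x0, other)])
```

Here the residual $x_0 - y_0$ is orthogonal to both basis vectors of M, which is the “dropping a perpendicular” picture of Figure 5.14.3.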




Let us illustrate an application of Theorem 5.14.4. It must be said immediately 
that this example far from exhausts the possible applications of this theorem. 


EXAMPLE 3. Suppose that we want to build a time-invariant linear filter F
which is modeled by

\[ (Fx)(t) = \int_{-\infty}^{\infty} h(t - \tau)x(\tau)\, d\tau, \]

where the “unit impulse response” h is given by

\[ h(t) = \begin{cases} 0, & \text{for } -\infty < t < 0 \\ 1, & \text{for } 0 \le t \le 1 \\ 0, & \text{for } 1 < t < \infty \end{cases} \]

and $x \in L_2(-\infty,\infty)$. (See Figure 5.14.5.) Further suppose that we cannot build this
filter itself but must construct an approximation to it. This approximation, call it $\hat{F}$,
will be chosen from those time-invariant linear filters whose unit impulse responses
or kernels $\hat{h}$ are of the form

\[ \hat{h}(t) = \begin{cases} 0, & t < 0 \\ \alpha_1 e^{-t} + \alpha_2 te^{-t} + \cdots + \alpha_n t^{n-1}e^{-t}, & 0 \le t, \end{cases} \]

where n is a fixed positive integer and the $\alpha$'s are scalars. We assume that $\hat{F}$ is chosen
so that the integral

\[ \int_0^\infty |h(t) - \hat{h}(t)|^2\, dt \]

is minimized.

Figure 5.14.5.

Now in the Hilbert space $L_2[0,\infty)$, the linear subspace spanned by the set
$\{e^{-t}, te^{-t}, \ldots, t^{n-1}e^{-t}\}$ is a closed linear subspace M of dimension n. Our
approximation problem falls into the framework of Theorem 5.14.4. We want to find the
point $\hat{h}$ in M that is closest to h. It follows that we want to choose $h - \hat{h}$ to be
orthogonal to M, so we have the following equations

\[ (h - \hat{h}, h_i) = 0, \qquad i = 1, 2, \ldots, n, \]

where $h_i(t) = t^{i-1}e^{-t}$. That is,

\[ (h,h_i) = \alpha_1(h_1,h_i) + \alpha_2(h_2,h_i) + \cdots + \alpha_n(h_n,h_i), \qquad 1 \le i \le n, \tag{5.14.3} \]

which is a linear system of n equations and n unknowns $\{\alpha_1, \ldots, \alpha_n\}$. The
appropriate $\alpha$'s are the solution of this system of equations. (See Exercise 9.) ∎
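For small n the system (5.14.3) can be set up and solved numerically. The following is a sketch under our own choices, not a prescription from the example: the inner products are evaluated in closed form, using $(h_j,h_i) = \int_0^\infty t^{i+j-2}e^{-2t}\,dt = (i+j-2)!/2^{i+j-1}$ and the recurrence $I_k = kI_{k-1} - e^{-1}$ for $I_k = \int_0^1 t^k e^{-t}\,dt$, and the system is solved by plain Gaussian elimination. All function names are ours.

```python
import math

def gram(n):
    """G[i][j] = (h_{j+1}, h_{i+1}) = (i+j)! / 2^(i+j+1), zero-based i, j."""
    return [[math.factorial(i + j) / 2 ** (i + j + 1) for j in range(n)]
            for i in range(n)]

def rhs(n):
    """b[i] = (h, h_{i+1}) = integral of t^i e^(-t) over [0,1]."""
    b = [1.0 - math.exp(-1.0)]
    for k in range(1, n):
        b.append(k * b[-1] - math.exp(-1.0))  # integration by parts
    return b

def solve(A, b):
    """Gaussian elimination with partial pivoting; returns x with Ax = b."""
    n = len(b)
    M = [row[:] + [bi] for row, bi in zip(A, b)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(M[r][c]))
        M[c], M[p] = M[p], M[c]
        for r in range(c + 1, n):
            f = M[r][c] / M[c][c]
            for k in range(c, n + 1):
                M[r][k] -= f * M[c][k]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (M[r][n] - sum(M[r][k] * x[k] for k in range(r + 1, n))) / M[r][r]
    return x

def best_alphas(n):
    """Coefficients alpha_1..alpha_n solving (5.14.3)."""
    return solve(gram(n), rhs(n))

def squared_error(n):
    """||h - h_hat||^2 = ||h||^2 - sum alpha_i (h, h_i), with ||h||^2 = 1."""
    return 1.0 - sum(a * b for a, b in zip(best_alphas(n), rhs(n)))
```

For n = 1 the system reduces to $\alpha_1 = (h,h_1)/(h_1,h_1) = 2(1 - e^{-1})$, and enlarging n can only decrease `squared_error(n)`, since the subspace M grows. The forward recurrence in `rhs` loses accuracy for large n, so this sketch is intended for small n only.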


In the proof of Theorem 5.14.4 the completeness of H was crucial. The next 
example illustrates this point. 


EXAMPLE 4. Let us begin this example with a brief synopsis. We start with an
incomplete inner product space X that is a dense subspace in a Hilbert space H.
Then let z be any point in H that is not in X, and define a subspace M of X as all
$y \in X$ such that (y,z) = 0. It is then shown that M is a proper closed linear subspace
of X. Finally, it is shown that if $x_0$ is in X but not in M, then there is no $y_0 \in M$ such
that

\[ \|x_0 - y_0\| = \inf\{\|x_0 - y\| : y \in M\}. \]

Roughly speaking, if there were such a $y_0$, then $x_0 - y_0$ would be orthogonal to M.
This in turn would mean that $x_0 - y_0$ would be a scalar multiple of z, but z is
not an element of X. Now for the details.

Let X denote the space of all polynomials with complex coefficients. If
$x(t) = \sum_i a_i t^i$ and $y(t) = \sum_i b_i t^i$, define

\[ (x,y) = \sum_i a_i \overline{b_i}. \]

X can be viewed as a dense linear subspace of $l_2$. (How?) Now define

\[ M = \Bigl\{x \in X : \sum_i (i + 1)^{-2} a_i = 0\Bigr\}, \]

that is, z is the sequence $\{(i + 1)^{-2}\} \in l_2$. If we define a mapping $l\colon X \to C$ by

\[ l(x) = \sum_i (i + 1)^{-2} a_i, \]

it is easy to see that l is a bounded linear mapping. Since M is the null space of l,
we see that M is closed. Since l is not the zero mapping, it follows from Theorem
4.12.2 that M is a proper subspace of X, and it has co-dimension one.

Let $x_0 \in X$ with $x_0 \notin M$. If there is a $y_0$ in M with

\[ \|x_0 - y_0\| = \inf\{\|x_0 - y\| : y \in M\} > 0, \]

then $z_0 = x_0 - y_0$ is in X and not in M. Furthermore by using the argument of
Theorem 5.14.4 one can show that $z_0 \perp M$. Since $z_0$ is in X, we can write it in the
form

\[ z_0(t) = c_0 + c_1 t + \cdots + c_N t^N \]

for some finite N. For $0 \le i \le N$, let

\[ w_i(t) = (i + 1)^2 t^i - (N + 2)^2 t^{N+1}. \]

Since $w_i \in M$, one has $z_0 \perp w_i$, for $0 \le i \le N$. However,

\[ z_0 \perp w_i \;\Rightarrow\; 0 = (z_0, w_i) = c_i(i + 1)^2 \;\Rightarrow\; c_i = 0. \]

Hence $z_0 = 0$, and this is a contradiction. ∎


The following is an important consequence of Theorem 5.14.4. 


5.14.5 COROLLARY. Let M and N be closed linear subspaces of a Hilbert space
H such that $M \subset N$ and $M \ne N$. Then there is a vector z in N such that $z \ne 0$ and
$z \perp M$.

The proof is simple. In the notation of Theorem 5.14.4 we choose $x_0 \in N$ and
$x_0 \notin M$, and note that $z = x_0 - y_0$, where $y_0$ is the point in M closest to $x_0$, satisfies
the conclusion of the corollary. Needless to say, this corollary, as Theorem 5.14.4,
depends on H being complete. This corollary should be compared with the Riesz
Theorem (Theorem 5.5.4).


EXERCISES 


1. Find the point in the proof of Theorem 5.14.4 where one uses the inner product
structure of H. Where does one use the fact that H is complete?

2. Let K be a closed convex set in a Hilbert space H.

(a) Show that there is one and only one point $x_0$ in K of minimum norm,
that is, $\|x_0\| \le \|x\|$ for all x in K.

(b) Show that this fails in an inner product space that is not complete. Recall
that a set K is convex if $x_1, x_2 \in K$ implies that $x = \lambda x_1 + (1 - \lambda)x_2 \in K$
for all $\lambda$, $0 \le \lambda \le 1$. [Hint: Study proof of Theorem 5.14.4.]


3. Consider the Banach space $R^2$ with norm $\|x\|_1 = |x_1| + |x_2|$.

(a) Show that every closed convex set in $(R^2, \|\cdot\|_1)$ has a point of minimum
norm.

(b) Show, by example, that this may not be unique.

(c) What happens with $\|x\|_\infty = \max(|x_1|, |x_2|)$?

4. Let X denote all those functions x(t) in C[0,1] with x(0) = 0, and assume that
X has the sup-norm. Let K denote the collection of all functions x in X with
$\int_0^1 x(t)\, dt = 1$.

(a) Show that K is convex.

(b) Show that there is no point in K with minimum norm.


5. Let X denote the linear subspace of $L_2[0,2\pi]$ made up of all trigonometric
polynomials of the form

\[ x(t) = \sum_{k=-n}^{n} a_k e^{ikt}. \]

Let M be the subspace of X defined by

\[ M = \Bigl\{x \in X : \int_0^{2\pi} t\,x(t)\, dt = 0\Bigr\}. \]

Let $x_0$ be in X with $x_0 \notin M$. Show that there is no point $y_0$ in M with
$\|x_0 - y_0\| = \inf\{\|x_0 - y\| : y \in M\}$.


6. Let L be a bounded linear transformation on a Hilbert space H with $\|L\| \le 1$.
Let $x \in H$ and let $y_n$ be the average

\[ y_n = \frac{1}{n}\bigl[x + Lx + \cdots + L^{n-1}x\bigr]. \]

The following steps will lead to a proof that there is a y in H with $y = \lim y_n$
in H.

(a) Let K denote the smallest closed convex set in H containing $\{x, Lx, L^2x, \ldots\}$,
and let y be the (unique) point in K of minimum norm. (Use Exercise 2.)

(b) Choose $z \in K$ so that z is a finite sum $z = \sum_i \alpha_i L^i x$ where $\alpha_i \ge 0$, $\sum_i \alpha_i = 1$,
and such that $\|z\| \le \|y\| + \varepsilon/2$ where $\varepsilon$ is some prescribed positive number.
Let $z_n = (1/n)[z + \cdots + L^{n-1}z]$. Show that $\|z_n\| \le \|y\| + \varepsilon/2$.

(c) Show that $\|y_n - z_n\| \le \varepsilon/2$, for n sufficiently large.
(d) Show that $\|y_n\| \le \|y\| + \varepsilon$ for n sufficiently large and hence $\lim \|y_n\| = \|y\|$.
(e) Use the fact that $y_n \in K$ to show that $\lim y_n = y$ in H.


7. (Mean Ergodic Theorem.) Let $(\Omega, \mathcal{F}, \mu)$ be a measure space and let $T\colon \Omega \to \Omega$ be
a one-to-one mapping of $\Omega$ into itself and assume that T preserves measure,
that is, $\mu(A) = \mu(TA)$ for every measurable set A. For $f \in L_2(\Omega)$ let

\[ g_n(x) = \frac{1}{n}\bigl[f(x) + f(Tx) + \cdots + f(T^{n-1}x)\bigr]. \]

Show that there is a function g in $L_2(\Omega)$ such that $g = \lim g_n$ in $L_2(\Omega)$. [Hint:
Define $L\colon L_2 \to L_2$ by Lf(x) = f(Tx). Show that $\|L\| = 1$ and apply Exercise 6.]


8. Let $\phi_a(t) = e^{2\pi i a t}$, where a is real. Let n be a fixed integer. Show that $\phi_a \perp \phi_n$ in
$L_2[0,1]$ if and only if $a = m \ne n$, where m is an integer.


9. Show that (5.14.3) always has a solution for $\alpha_1, \alpha_2, \ldots, \alpha_n$. Solve for the $\alpha$'s.

10. Consider the space X of all continuous functions x(t) such that

\[ \|x\| = \Bigl[\lim_{T\to\infty} \frac{1}{2T}\int_{-T}^{T} |x(t)|^2\, dt\Bigr]^{1/2} < \infty. \]

Define an inner product on X by

\[ (x,y) = \lim_{T\to\infty} \frac{1}{2T}\int_{-T}^{T} x(t)\overline{y(t)}\, dt. \]

Let $\phi_a(t) = e^{iat}$ where a is real. Show that $(\phi_a, \phi_b) = \delta_{ab}$, that is, $\phi_a \perp \phi_b$
whenever $a \ne b$.


11. Let z = x + y, where $x \perp y$. Show that (z,x) is real and $(z,x) = \|x\|^2$.


15. ORTHOGONAL COMPLEMENTS AND THE PROJECTION THEOREM 


Let X be an inner product space and let M be any subset of X. Define $M^\perp$, the
orthogonal complement of M, by

\[ M^\perp = \{x \in X : (x,y) = 0 \text{ for all } y \in M\}. \]

That is, $M^\perp$ is made up of all points that are orthogonal to every point in M. We
shall write $x \perp M$ if $x \in M^\perp$. If $M = \emptyset$, the empty set, then $M^\perp = X$.

EXAMPLE 1. Let $H = l_2$, and also let M be the set $\{e_1, e_2, e_3, e_4\}$, where
$e_1 = \{1,0,0,0,\ldots\}$, $e_2 = \{0,1,0,0,\ldots\}$, $e_3 = \{0,0,1,0,\ldots\}$, and $e_4 = \{0,0,0,1,0,\ldots\}$.
The orthogonal complement $M^\perp$ is the set of all sequences in $l_2$ of the form
$x = \{0,0,0,0,x_5,x_6,x_7,\ldots\}$. ∎


EXAMPLE 2. Let $(\Omega, \mathcal{F}, P)$ be a probability space, and let A be a set in $\mathcal{F}$.
Denote the complement of A by A'. Then let M be the linear subspace of
$L_2(\Omega, \mathcal{F}, P)$ made up of all random variables $x(\omega)$ such that $x(\omega) = 0$ (a.e.)⁹ in A'.
(See Figure 5.15.1.) That is,

\[ M = \Bigl\{x \in L_2(\Omega, \mathcal{F}, P) : \int_{A'} |x(\omega)|^2\, dP = 0\Bigr\}. \]

Figure 5.15.1.

⁹ The abbreviation “a.e.” stands for the phrase “almost everywhere” whose technical meaning is
discussed in Appendix D.

The orthogonal complement of M is given by

\[ M^\perp = \Bigl\{x \in L_2(\Omega, \mathcal{F}, P) : \int_{A} |x(\omega)|^2\, dP = 0\Bigr\}. \] ∎


EXAMPLE 3. Suppose that a probability space $(\Omega, \mathcal{F}, P)$ can be represented as
the square [0,1] × [0,1] in $R^2$ with the usual Lebesgue measure structure. Consider
the Hilbert space $H = L_2(\Omega, \mathcal{F}, P)$, and let y be a point in H and assume that y
does not depend on $\omega_1$. (See Figure 5.15.2.) Since the random variable y does not
depend on $\omega_1$, the inverse image of any Lebesgue measurable set J in C has a strip
form similar to that of the set B shown in Figure 5.15.2. In fact, the collection
of all such inverse images generates a sub-σ-algebra, denote it by $\mathcal{B}$, of $\mathcal{F}$. In
this case let us assume that $\mathcal{B}$ is maximal in the following sense: If A is any
Lebesgue measurable set in [0,1], then [0,1] × A is in $\mathcal{B}$. In any event, $(\Omega, \mathcal{B}, P_{\mathcal{B}})$
is a probability space, where $P_{\mathcal{B}}$ is the restriction of P to $\mathcal{B}$. Let M denote the
Hilbert space $L_2(\Omega, \mathcal{B}, P_{\mathcal{B}})$. It should be clear that M is the linear subspace of H
made up of all random variables x that do not depend on $\omega_1$.

Figure 5.15.2.

The orthogonal complement of M is the linear subspace of H made up of all
random variables z such that

\[ \int_0^1 z(\omega_1, \omega_2)\, d\omega_1 = f(\omega_2) = 0 \quad \text{(a.e.)}. \tag{5.15.1} \]

Indeed, z is in $M^\perp$ if and only if (z,x) = 0 for all $x \in M$, that is,

\[ \int_0^1\!\!\int_0^1 z(\omega_1, \omega_2)\overline{x(\omega_2)}\, d(\omega_1 \times \omega_2) = \int_0^1 f(\omega_2)\overline{x(\omega_2)}\, d\omega_2 = 0, \]

where f is defined by (5.15.1). Since x can be any square integrable function, it follows
that $z \in M^\perp$ if and only if $f(\omega_2) = 0$ (a.e.). A sketch of an allowable z is shown in
Figure 5.15.3.

The subspace M has an important interpretation in terms of conditional
expectations. Suppose that x is any random variable in $L_2(\Omega, \mathcal{F}, P)$. The conditional
expectation of x with respect to the given random variable y, which we denote by
$E^y(x)$, is itself a random variable in $L_2(\Omega, \mathcal{F}, P)$. In fact, it is in $L_2(\Omega, \mathcal{B}, P_{\mathcal{B}})$.
Indeed, $E^y(x)$ is (compare with Section E.5) the random variable in $L_2(\Omega, \mathcal{B}, P_{\mathcal{B}})$
satisfying the condition

\[ \int_B E^y(x)\, dP_{\mathcal{B}} = \int_B x\, dP \]

for all sets B in $\mathcal{B}$. It follows that $E^y$ is a mapping of $L_2(\Omega, \mathcal{F}, P)$ into itself and the
range of $E^y$ is $M = L_2(\Omega, \mathcal{B}, P_{\mathcal{B}})$. We can also characterize the null space of $E^y$.

Figure 5.15.3.

Suppose $E^y(x) = 0$, then

\[ \int_B 0\, dP_{\mathcal{B}} = \int_B x(\omega_1, \omega_2)\, d(\omega_1 \times \omega_2) \]

for all $B \in \mathcal{B}$. Since all B are of the form [0,1] × A, we have

\[ \int_A \Bigl[\int_0^1 x(\omega_1, \omega_2)\, d\omega_1\Bigr]\, d\omega_2 = 0 \]

for all A, that is, $\int_0^1 x(\omega_1, \omega_2)\, d\omega_1 = 0$ (a.e.). It follows from (5.15.1) that $E^y(x) = 0$ if
and only if $x \in M^\perp$. This example is extended in Example 4, Section 16. ∎


5.15.1 THEOREM. Let M be a set in an inner product space X. Then $M^\perp$ is a
closed linear subspace of X.

Proof: It is easy to see that $M^\perp$ is a linear space. In order to show that $M^\perp$
is closed, let $\{x_n\}$ be any convergent sequence in $M^\perp$ with $x_n \to x_0$. If we show that
$x_0 \in M^\perp$, it will follow from the Closed Set Theorem that $M^\perp$ is closed. But if y is
any point of M, then

\[ (x_0, y) = \lim (x_n, y) = 0 \]

owing to the continuity of the inner product and Theorem 3.7.2. (See Exercise 7,
Section 12.) ∎


The following corollary is an obvious consequence of the last theorem. 
5.15.2 COROLLARY. If $X$ is complete, then $M^\perp$ is complete.
Let us consider some elementary properties of orthogonal complements. 


5.15.3 THEOREM. Let M and N be nonempty sets in an inner product space X. 
The following statements are valid: 


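(a) If $M \subset N$, then $N^\perp \subset M^\perp$.

(b) $M \subset M^{\perp\perp}$.

(c) If $M \subset N$, then $M^{\perp\perp} \subset N^{\perp\perp}$.

(d) $M^\perp = M^{\perp\perp\perp}$.

(e) If $x \in M \cap M^\perp$, then $x = 0$.

(f) $\{0\}^\perp = X$ and $X^\perp = \{0\}$.

(g) If $M$ is a dense subset of $X$, then $M^\perp = \{0\}$.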


Note that these results do not require completeness. 


Proof: 

(a) Let $x \in M$ and $y \in N^\perp$. Since $x \in N$, one has $(x,y) = 0$. Since $x$ is an
arbitrary point in $M$, we have $y \in M^\perp$, or $N^\perp \subset M^\perp$.

(b) Let $x$ be any point in $M$. By definition $x \perp M^\perp$, so clearly $x \in M^{\perp\perp}$.

(c) This follows by applying (a) twice.

(d) Since $M \subset M^{\perp\perp}$, by statement (b), it follows that $M^{\perp\perp\perp} \subset M^\perp$, by
statement (a). By applying (b) to $M^\perp$, we get $M^\perp \subset M^{\perp\perp\perp}$. Hence, $M^\perp = M^{\perp\perp\perp}$.

(e) If $x \in M \cap M^\perp$, then $(x,x) = 0$, which implies that $x = 0$.

(f) This is obvious.

(g) If $x \in M^\perp$, then for all $y \in M$ one has

$$\|x - y\|^2 = \|x\|^2 + \|y\|^2 \ge \|x\|^2$$

by the Pythagorean Theorem. Since $M$ is dense we can choose $y$ so that $\|x - y\|$ can
be made arbitrarily small. Hence $x = 0$. (Carefully note that this result does not
imply that if $M^\perp = \{0\}$, then $M$ is dense in $X$. Such a statement requires, as we shall
see, completeness.) ∎




EXAMPLE 4. It is shown in Appendix D that the space $C_0^\infty(-\infty,\infty)$, where
$C_0^\infty(-\infty,\infty)$ is the subset of $C^\infty(-\infty,\infty)$ made up of functions with compact
support, is dense in $L_2(-\infty,\infty)$. Therefore, if $M$ is any linear subspace of
$L_2(-\infty,\infty)$ that contains $C_0^\infty(-\infty,\infty)$, then $M^\perp = \{0\}$. A particular example of
this is the space

$$M = \left\{x \in L_2(-\infty,\infty) : \int_{-\infty}^{\infty} \omega^2\,|x(\omega)|^2\,d\omega < \infty\right\}.$$

Since $C_0^\infty \subset M$, one does have $M^\perp = \{0\}$. This space $M$ arises in the study of the
Fourier transform of differential operators. ∎


The orthogonal complement $M^\perp$ is, of course, defined for any set $M$. However,
we shall be primarily interested in the case where $M$ is a linear subspace of $X$. In
fact, $M$ will often be a closed linear subspace of a Hilbert space.


5.15.4 THEOREM. Let M be a linear subspace of a Hilbert space H. The follow- 
ing statements are valid: 


(a) $M^{\perp\perp} = \overline{M}$, where $\overline{M}$ is the closure of $M$.
(b) If $M$ is closed, then $M^{\perp\perp} = M$.

(c) $M^\perp = \{0\}$ if and only if $M$ is dense in $H$.
(d) If $M$ is closed and $M^\perp = \{0\}$, then $M = H$.


Proof: 

(a) Let $N = \overline{M}$, where $M$ is a linear subspace of $H$. Thus, $M \subset N$. By Theorem
5.15.3(b), we have $M \subset M^{\perp\perp}$. Since $M^{\perp\perp}$ is closed (Theorem 5.15.1), one has
$\overline{M} = N \subset M^{\perp\perp}$. If $N \ne M^{\perp\perp}$, then by Corollary 5.14.5 there is a nonzero vector $z$
in $M^{\perp\perp}$ such that $z \perp N$. (Here is where we use the completeness of $H$.) Since
$M \subset N$, one has $z \perp M$. Therefore, $z \in M^\perp$. But $z \in M^\perp \cap M^{\perp\perp}$ implies [Theorem
5.15.3(e)] that $z = 0$, a contradiction. Hence, $N = M^{\perp\perp}$.

(b) Follows directly from (a).

(c) The "if" part is simply Theorem 5.15.3(g). Suppose now that $M^\perp = \{0\}$.
Then $M^{\perp\perp} = \{0\}^\perp = H$. It follows from (a) that $H = \overline{M}$, so $M$ is dense in $H$.

(d) Follows directly from (c). ∎


The next example shows that Theorem 5.15.4 does not hold if the space is not 
complete. 


EXAMPLE 5. In Example 4, Section 14 we showed that if $z_0 \perp M$, then $z_0 = 0$.
In other words, $M^\perp = \{0\}$. It follows from Theorem 5.15.3(f) that $M^{\perp\perp} = X$. On
the other hand, $M$ is closed but a proper subspace of $X$. Hence, $M^{\perp\perp} \ne M = \overline{M}$.
Carefully note that this example illustrates the failure of (a), (b), (c), and (d) in
Theorem 5.15.4 when $X$ is not complete. ∎




The remainder of this section is devoted to showing that the plausible statement
$H = M + M^\perp$ is indeed true for Hilbert spaces; see Figure 5.15.4. The preceding
example shows that the same statement is not necessarily true if the space is not
complete.

To begin with, we have to say a few words about sums of subspaces in inner
product spaces. If $M$ and $N$ are linear subspaces of $X$, the sum $M + N$ is, of course,
defined exactly as in Section 4.10. However, now we can investigate the topological
properties of $M + N$. Specifically, if $M$ and $N$ are both closed, what can be said
about $M + N$? One might suspect that $M + N$ is always closed. Unfortunately, this
is not always true, as we shall see in the exercises. However, if $M \perp N$, then we can
say the following:




Figure 5.15.4. 


5.15.5 THEOREM. Let $M$ and $N$ be closed linear subspaces of a Hilbert space $H$.
If $M \perp N$, then $M + N$ is a closed linear subspace of $H$.


Proof: Let $\{z_n\}$ be a convergent sequence in $M + N$ with limit $z = \lim z_n$. We
wish to show that $z \in M + N$. It follows that $z_n = x_n + y_n$, where $x_n \in M$ and $y_n \in N$.
Since $M \perp N$ one has

$$\|z_n - z_m\|^2 = \|x_n - x_m\|^2 + \|y_n - y_m\|^2$$

by the Pythagorean Theorem. Hence $\{x_n\}$ and $\{y_n\}$ are Cauchy sequences in $M$ and
$N$, respectively. Since $M$ and $N$ are complete, the limits $x = \lim x_n$, $y = \lim y_n$ exist
and are in $M$ and $N$, respectively. It follows from the continuity of addition that
$z = x + y$. Hence, $M + N$ is closed. ∎


Two assumptions in the last theorem are critical: (1) $H$ is complete and (2)
$M \perp N$. We shall show in the exercises (see Exercise 2 of this section and
Exercise 11, Section 17) that without these assumptions $M + N$ need not be closed.
We are now ready to prove a key theorem of Hilbert space geometry. 


5.15.6 THEOREM. (THE PROJECTION THEOREM: FIRST VERSION.) Let $M$ be any
closed linear subspace of a Hilbert space $H$. Then $H = M + M^\perp$. Moreover, each
$x \in H$ can be expressed uniquely as $x = m + n$, where $m \in M$ and $n \in M^\perp$, and
$\|x\|^2 = \|m\|^2 + \|n\|^2$.




It must be appreciated that this is an extremely important statement. We shall 
see in the next section that it guarantees the existence of a rich and useful supply of 
orthogonal projections in any Hilbert space. Moreover, a key difference between 
inner product spaces and Hilbert spaces is that this theorem is not true for inner 
product spaces in general. Completeness is needed. We should also mention that a 
similar statement for Banach spaces is also not true. In particular, suppose that M 
is a closed linear subspace of a Banach space B. It can happen, see Taylor [2, p. 242], 
that there is no closed linear subspace N disjoint from M such that B= M +N. 
Here again simple geometric intuition works for Hilbert spaces but not elsewhere. 


Proof: By Theorem 5.15.5 (which requires the completeness of $H$) we
see that $Y = M + M^\perp$ is a closed linear subspace of $H$. Since $M \subset Y$ and
$M^\perp \subset Y$, one has, using Theorem 5.15.3(a), $Y^\perp \subset M^\perp$ and $Y^\perp \subset M^{\perp\perp}$, that is,
$Y^\perp \subset M^\perp \cap M^{\perp\perp}$. It follows from Theorem 5.15.3(e) that $Y^\perp = \{0\}$. Hence
$Y = H$ by Theorem 5.15.4(d).

It follows next from Lemma 4.10.1 that for any $x \in H$ there are unique points
$m \in M$ and $n \in M^\perp$ such that $x = m + n$. Since $m \perp n$, the Pythagorean Theorem
assures us that $\|x\|^2 = \|m\|^2 + \|n\|^2$. ∎
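
The decomposition $x = m + n$ can be checked numerically in a finite-dimensional space. The following is a minimal sketch only, assuming $H = \mathbb{R}^3$ with the usual dot product; the vectors `u1`, `u2`, and `x` and the helper names `dot` and `project` are illustrative choices that do not come from the text.

```python
# Sketch of x = m + n with m in M and n in M-perp, for H = R^3.
# The vectors u1, u2, x are illustrative assumptions, not taken from the text.

def dot(a, b):
    """Real inner product on R^n."""
    return sum(ai * bi for ai, bi in zip(a, b))

def project(x, orthonormal_basis):
    """Component of x in the span of the given orthonormal vectors."""
    m = [0.0] * len(x)
    for u in orthonormal_basis:
        coeff = dot(x, u)
        m = [mi + coeff * ui for mi, ui in zip(m, u)]
    return m

u1 = [1.0, 0.0, 0.0]          # u1, u2 form an orthonormal basis of M
u2 = [0.0, 0.6, 0.8]
x = [2.0, 1.0, 3.0]

m = project(x, [u1, u2])                 # m lies in M
n = [xi - mi for xi, mi in zip(x, m)]    # n = x - m should lie in M-perp

print("m =", m)
print("n =", n)
print("(n,u1), (n,u2) =", dot(n, u1), dot(n, u2))
print("||x||^2 =", dot(x, x), " ||m||^2 + ||n||^2 =", dot(m, m) + dot(n, n))
```

Up to rounding error, the printed inner products of `n` with `u1` and `u2` vanish, and the two squared-norm expressions agree, as the theorem asserts.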


EXAMPLE 6. Let $H = L_2(-\infty,\infty)$, and let $S_\tau: H \to H$ be the shift operation
(by $\tau$), that is, $(S_\tau x)(t) = x(t - \tau)$. Let $x_0$ be a fixed point in $H$, and consider the
linear subspace $M$ of $H$ made up of all points $x$ of the form

$$x = \alpha_1 x_0(t - \tau_1) + \cdots + \alpha_n x_0(t - \tau_n) = [\alpha_1 S_{\tau_1} + \cdots + \alpha_n S_{\tau_n}]x_0, \tag{5.15.2}$$

$n = 1,2,3,\ldots$, that is, all (finite) linear combinations of shifted versions of $x_0$.

It is a fact that $\overline{M}$, the closure of $M$, has many important applications. For
example, linear subspaces of this form are used in the study of time-invariant linear
systems. Hence, it is of interest to characterize the closed linear subspace $\overline{M}$. We
will use the fact that the Fourier transform $\mathcal{F}: L_2(-\infty,\infty) \to L_2(-i\infty,i\infty)$ is an
isometric isomorphism; see Example 6, Section 19, and Example 11, Section 22.

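Since $\mathcal{F}$ is an isometry, characterizing the subspace $\overline{M}$ in $L_2(-\infty,\infty)$ is equivalent
to characterizing the subspace $\overline{\mathcal{M}} = \mathcal{F}(\overline{M})$ in $L_2(-i\infty,i\infty)$. Since

$$\mathcal{F}[S_\tau x_0](i\omega) = e^{-i\omega\tau}\,\hat{x}_0(i\omega),$$

where $\hat{x}_0$ is the Fourier transform of $x_0$, it follows that $\overline{\mathcal{M}}$ is the closure of the
subspace $\mathcal{M}$ of $L_2(-i\infty,i\infty)$ made up of all $\hat{x}$ of the form

$$\hat{x}(i\omega) = [\alpha_1 e^{-i\omega\tau_1} + \cdots + \alpha_n e^{-i\omega\tau_n}]\,\hat{x}_0(i\omega).$$

For each $\hat{x} \in L_2(-i\infty,i\infty)$ we define the support $K(\hat{x})$ to be that subset of
$(-i\infty,i\infty)$ on which $\hat{x}(i\omega) \ne 0$. That is, $\hat{x}(i\omega) = 0$ for $(i\omega) \notin K(\hat{x})$. Recall that a
function $\hat{x} \in L_2(-i\infty,i\infty)$ is only determined up to a set of measure zero. Therefore,
the support set $K(\hat{x})$ is only determined within a set of measure zero.

We now claim that a function $\hat{z}$ lies in $\overline{\mathcal{M}}$ if and only if

$$K(\hat{z}) \subset K(\hat{x}_0) \quad \text{(a.e.)} \tag{5.15.3}$$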




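This means that

$$\{i\omega \in K(\hat{z}) : i\omega \notin K(\hat{x}_0)\}$$

is a set of measure zero.

Let us first show that if $\hat{z}$ lies in $\mathcal{M}$, then (5.15.3) holds. Indeed, since

$$\hat{z}(i\omega) = [\alpha_1 e^{-i\omega\tau_1} + \cdots + \alpha_n e^{-i\omega\tau_n}]\,\hat{x}_0(i\omega),$$

this is obvious. Next, if $\hat{z}$ lies in $\overline{\mathcal{M}}$, then $\hat{z}$ is the limit of a sequence $\{\hat{z}_n\}$ in $\mathcal{M}$.
Since $K(\hat{z}_n) \subset K(\hat{x}_0)$, one then has $K(\hat{z}) \subset K(\hat{x}_0)$.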

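Next let us show that the converse is true, that is, let $\hat{z} \in L_2(-i\infty,i\infty)$ be chosen
so that $K(\hat{z}) \subset K(\hat{x}_0)$. Since $\overline{\mathcal{M}}$ is closed, it follows from the Projection Theorem
that there is a unique point $\hat{m} \in \overline{\mathcal{M}}$ such that $\hat{z} - \hat{m} \perp \overline{\mathcal{M}}$. In particular, one has

$$f(t) = \frac{1}{2\pi}\int_{-\infty}^{\infty} [\hat{z}(i\omega) - \hat{m}(i\omega)]\,\overline{\hat{x}_0(i\omega)}\,e^{i\omega t}\,d\omega = 0$$

for all $t \in (-\infty,\infty)$. However, $f(t)$ is merely the inverse Fourier transform of
$[\hat{z} - \hat{m}]\,\overline{\hat{x}_0}$ (compare with Example 6, Section 19). Since $\mathcal{F}^{-1}$ is one-to-one, one has
$[\hat{z}(i\omega) - \hat{m}(i\omega)]\,\overline{\hat{x}_0(i\omega)} = 0$ (a.e.). That is, $\hat{z}(i\omega) = \hat{m}(i\omega)$ (a.e.) on $K(\hat{x}_0)$. Since
$K(\hat{z}) \subset K(\hat{x}_0)$, by assumption, and $K(\hat{m}) \subset K(\hat{x}_0)$, by the argument of the last
paragraph, we see that $\hat{z} = \hat{m}$. Hence $\hat{z} \in \overline{\mathcal{M}}$.

If $\mathcal{F}^{-1}$ denotes the inverse Fourier transform, we then see that $\overline{M} = \mathcal{F}^{-1}(\overline{\mathcal{M}})$.
Note that if $\hat{x}_0(i\omega)$ vanishes only on a set of measure zero, then $\overline{M} = L_2(-\infty,\infty)$. ∎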


EXERCISES 


1. Let $M$ and $N$ be linear subspaces of a Hilbert space $H$ with $M \perp N$.
(a) Show that $M^{\perp\perp} \perp N^{\perp\perp}$.
(b) Is it true that $M^\perp \perp N^\perp$, or $M^{\perp\perp\perp} \perp N^{\perp\perp\perp}$?


2. Let $X$ be the linear subspace of $l_2$ generated by the vectors

$$\left(1, \tfrac{1}{2}, \tfrac{1}{2^2}, \ldots\right),\ e_1,\ e_2,\ \ldots,$$

where $e_j = (\delta_{1j},\delta_{2j},\ldots)$ and $\delta_{ij}$ = Kronecker function. Let $M$ denote those vectors
$x = (x_1,x_2,\ldots)$ in $X$ with $x_{2i-1} = 0$ for all $i$ and $N$ denote those vectors with
$x_{2i} = 0$ for all $i$.

(a) Show that $M \perp N$ and that $M$ and $N$ are closed.

(b) Show that the vector $(1,1/2,1/2^2,\ldots)$ is not in $M + N$ but it is the limit of
vectors $z_n$ in $M + N$. (This shows that Theorem 5.15.5 fails if the underlying
space $X$ is not complete.)


3. Let $M_1,\ldots,M_n$ be closed linear subspaces of a Hilbert space $H$ with $M_i \perp M_j$
for $i \ne j$. Show that $M = M_1 + \cdots + M_n$ is a closed linear subspace of $H$.




4. Let $\mathcal{L}$ denote the collection of all closed subspaces of a Hilbert space $H$. For
$M$, $N$ in $\mathcal{L}$ let $M \wedge N$ denote the usual intersection $M \cap N$ and define

$$M \vee N = (M^\perp \cap N^\perp)^\perp.$$

(a) Show that $M \wedge N$ and $M \vee N$ belong to $\mathcal{L}$ whenever $M$ and $N$ do. (This
means that $\mathcal{L}$ is a lattice.)

(b) The lattice $\mathcal{L}$ is said to be modular if $M \vee (N \wedge K) = (M \vee N) \wedge K$ whenever
$M \subset K$. Show that $\mathcal{L}$ is modular if and only if $H$ is finite dimensional.

(c) When does one have $M \vee N = M + N$?


16. ORTHOGONAL PROJECTIONS 


The algebraic concept of a projection was discussed in Section 4.11. Recall
that $P: X \to X$ is a projection if (1) it is linear and (2) $P^2 = P$. It was shown that from
three natural points of view this is a reasonable definition of projection. First
(Theorem 4.11.2), given a projection $P$, its range $\mathcal{R}(P)$ and null space $\mathcal{N}(P)$ are
disjoint linear subspaces with $X = \mathcal{R}(P) + \mathcal{N}(P)$. Secondly (Theorem 4.11.3), given
two disjoint linear subspaces $M$ and $N$ with $X = M + N$, there is a unique projection
$P$ such that $\mathcal{R}(P) = M$ and $\mathcal{N}(P) = N$. Thirdly (Theorem 4.11.4), given a linear
subspace $M$, there is a projection $P$ such that $\mathcal{R}(P) = M$; in fact there may be many
such projections. Now that $X$ is a normed linear space, one can ask whether or not
these projections are continuous. As it happens some are and some are not. Indeed,
if the range and null space get arbitrarily "close together" the associated projection
is discontinuous.

Orthogonal projections on inner product spaces are a particularly important
class of continuous projections with rather obvious geometric antecedents.


5.16.1 DEFINITION. A projection $P$ on an inner product space $X$ is said to be
orthogonal if its range and null space are orthogonal, that is, $\mathcal{R}(P) \perp \mathcal{N}(P)$.


It follows immediately that if $P$ is an orthogonal projection, then so is $I - P$.
As already suggested, orthogonal projections are continuous.


5.16.2 THEOREM. An orthogonal projection is continuous. 


Proof: Since $P$ is a projection, each $x$ in the inner product space $X$ can be
uniquely expressed as $x = r + n$, with $r \in \mathcal{R}(P)$ and $n \in \mathcal{N}(P)$. Since $P$ is orthogonal,
we have $r \perp n$. It follows from the Pythagorean Theorem that $\|x\|^2 = \|r\|^2 + \|n\|^2$,
so $\|Px\|^2 = \|r\|^2 \le \|x\|^2$ and $P$ is continuous. (We leave it to the reader to show that
if $P \ne 0$, then $\|P\| = 1$.) ∎


Notice that the last theorem is valid even if X is not complete. 

The next theorem shows that if we start with an orthogonal projection on an 
inner product space, the geometric situation is without surprise and agrees with 
intuition. 




5.16.3 THEOREM. If $P$ is an orthogonal projection on an inner product space $X$,
then


(1) $\mathcal{N}(P)$ and $\mathcal{R}(P)$ are closed linear subspaces,

(2) $\mathcal{N}(P) = \mathcal{R}(P)^\perp$ and $\mathcal{R}(P) = \mathcal{N}(P)^\perp$,

(3) each $x \in X$ can be written uniquely as $x = r + n$, where $r \in \mathcal{R}(P)$ and
$n \in \mathcal{N}(P)$, and

(4) $\|x\|^2 = \|r\|^2 + \|n\|^2$.


Proof: The proof of (3) follows directly from the fact that $P$ is a projection
(Theorem 4.11.2). Similarly, (4) follows from the fact that $P$ is orthogonal. Statement
(1) follows from the fact that $\mathcal{N}(P)$ and $\mathcal{R}(P)$ are the null spaces of the continuous
operators $P$ and $I - P$, respectively.

The proof of (2) is relatively straightforward. Since $\mathcal{N}(P) \perp \mathcal{R}(P)$, it is clear
that $\mathcal{N}(P) \subset \mathcal{R}(P)^\perp$. Now, we want to show that $\mathcal{N}(P) \supset \mathcal{R}(P)^\perp$, and hence, that
$\mathcal{N}(P) = \mathcal{R}(P)^\perp$. Let $x$ be any point in $\mathcal{R}(P)^\perp$. Then there exist unique $r_0 \in \mathcal{R}(P)$
and $n_0 \in \mathcal{N}(P)$ such that $x = r_0 + n_0$. Since $x \in \mathcal{R}(P)^\perp$, one has $(x,r) = 0$ for
all $r \in \mathcal{R}(P)$. Then $0 = (r_0 + n_0, r) = (r_0,r) + (n_0,r) = (r_0,r)$ for all $r \in \mathcal{R}(P)$, in
particular, for $r = r_0$. Hence, $r_0 = 0$ and $x = n_0 \in \mathcal{N}(P)$, which shows that
$\mathcal{N}(P) \supset \mathcal{R}(P)^\perp$. A similar argument shows that $\mathcal{R}(P) = \mathcal{N}(P)^\perp$. ∎


Next let $M$ and $N$ be linear subspaces of an inner product space $X$ and assume
that $M \perp N$ and $X = M + N$. Since $M \cap N = \{0\}$, it follows from Theorem 4.11.3
that there is a unique projection $P$ with range $\mathcal{R}(P) = M$ and null space $\mathcal{N}(P) = N$.
It follows then from Definition 5.16.1 that $P$ is an orthogonal projection.

So far we have not used completeness. 

Suppose now we are given a single linear subspace $M$ of an inner product
space $X$. Can we find an orthogonal projection $P$ with the property that the range
$\mathcal{R}(P)$ is precisely $M$? If we can, then $M$ is the null space of the orthogonal projection
$I - P$ and, consequently, it is closed. So we obviously have to assume that $M$ is
closed. The following result gives an affirmative answer in the case of a Hilbert
space.


5.16.4 THEOREM. (THE PROJECTION THEOREM: SECOND VERSION.) Let $M$ be any
closed linear subspace of a Hilbert space $H$. Then there is one and only one orthogonal
projection $P$ with $\mathcal{R}(P) = M$.


Proof: By the first version of the Projection Theorem (Theorem 5.15.6) we
have

$$H = M + M^\perp$$

and every vector $x$ in $H$ can be written uniquely as $x = m + n$, where $m \in M$ and
$n \in M^\perp$. Now define $P: H \to H$ by

$$P(m + n) = m.$$

It is easy to check that $P$ is an orthogonal projection with $\mathcal{R}(P) = M$. Next let us
show that $P$ is unique. Suppose $\tilde{P}$ is another orthogonal projection with $\mathcal{R}(\tilde{P}) = M$.
Since $\tilde{P}$ is orthogonal, one has $\mathcal{N}(\tilde{P}) = M^\perp$. Then

$$\tilde{P}m = m = Pm, \quad m \in M,$$

$$\tilde{P}n = 0 = Pn, \quad n \in M^\perp.$$

Hence, $\tilde{P}(m + n) = P(m + n)$ for all $x = m + n \in H$, that is, $\tilde{P} = P$. ∎


It should be clear that the two versions of the Projection Theorem (Theorems
5.15.6 and 5.16.4) are equivalent. The reason for the name "Projection Theorem"
should also be clear after the last result.

Both versions of the Projection Theorem use the fact that H is complete in an 
essential way. Let us now show that Theorem 5.16.4 fails when the underlying 
inner product space is not complete. 


EXAMPLE 1. Referring back to Example 4, Section 14, let us assume that $P$
is an orthogonal projection on $X$ with $Px = x$ for all $x$ in $M$. We shall now show that
$P$ is necessarily the identity on $X$; in other words, $X = \mathcal{R}(P) \ne M$, that is, there is no
orthogonal projection of $X$ onto $M$. First we note that since $P$ is a projection one
has $X = \mathcal{R}(P) + \mathcal{N}(P)$, by Theorem 4.11.2. Since $M \subset \mathcal{R}(P)$, one has $\mathcal{R}(P)^\perp \subset M^\perp$
by Theorem 5.15.3(a). However, we have shown that $M^\perp = \{0\}$ in Example 4,
Section 14. Hence $\mathcal{N}(P) = \mathcal{R}(P)^\perp = \{0\}$, or $\mathcal{R}(P) = \mathcal{N}(P)^\perp = X$ by Theorem
5.15.3(f). ∎




In summary, then, geometric intuition "works" completely for orthogonal
projections on Hilbert spaces. However, if the space $X$ is not complete, we have to
be careful. If we start with an orthogonal projection or if we start with complementary
orthogonal subspaces, then there is no difficulty. But, as we have just seen,
if we start with a closed linear subspace $M$ in an incomplete space, there may not
be an orthogonal projection $P$ with $\mathcal{R}(P) = M$.

Let us return now to the approximation result of Theorem 5.14.4. Another
way of formulating this result is to say that there is a mapping of the Hilbert space
$H$ into the closed linear subspace $M$ (mapping $x_0$ onto $y_0$) with certain properties.
We show now that this mapping is, not surprisingly, the orthogonal projection of $H$
onto $M$.


5.16.5 THEOREM. Let $M$ be a closed linear subspace in a Hilbert space $H$, and
let $x_0 \in H$. Further, let $P$ be the orthogonal projection with $\mathcal{R}(P) = M$. Then

$$\|x_0 - Px_0\| < \|x_0 - y\| \tag{5.16.1}$$

for all $y \ne y_0 = Px_0$ in $M$. That is, $\|x_0 - Px_0\| = \inf\{\|x_0 - y\| : y \in M\}$. (See
Figure 5.16.1.)




Figure 5.16.1. 


Proof: First we note that $x_0 - Px_0 \perp M$. (Why?) Therefore,

$$x_0 - Px_0 \perp Px_0 - y$$

for all $y$ in $M$. Hence, by the Pythagorean Theorem we get

$$\|x_0 - y\|^2 = \|x_0 - Px_0 + Px_0 - y\|^2 = \|x_0 - Px_0\|^2 + \|Px_0 - y\|^2 \ge \|x_0 - Px_0\|^2.$$

As a matter of fact, the inequality is strict unless $y = y_0 = Px_0$. ∎
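
The minimizing property (5.16.1) can be illustrated numerically. The sketch below is only an illustration under assumed data: it takes $H = \mathbb{R}^3$, lets $M$ be the line spanned by a unit vector `u`, and compares the distance from `x0` to `Px0` with the distance to several other points of $M$. The names `u`, `x0`, `best`, and `others` are hypothetical.

```python
import math

def dot(a, b):
    """Real inner product on R^n."""
    return sum(ai * bi for ai, bi in zip(a, b))

def norm(a):
    return math.sqrt(dot(a, a))

# Assumed example data: M is the line spanned by the unit vector u.
u = [1 / 3, 2 / 3, 2 / 3]
x0 = [3.0, 0.0, 0.0]

# Orthogonal projection of x0 onto M: Px0 = (x0, u) u.
coeff = dot(x0, u)
Px0 = [coeff * ui for ui in u]
best = norm([a - b for a, b in zip(x0, Px0)])

# Distances from x0 to other points y = t*u of M (t != coeff).
others = []
for t in [-2.0, 0.0, 0.5, 0.9, 1.1, 2.0, 5.0]:
    y = [t * ui for ui in u]
    others.append(norm([a - b for a, b in zip(x0, y)]))

print("distance to Px0:", best)
print("distances to other points of M:", others)
```

Every entry of `others` exceeds `best`, in agreement with the strict inequality for $y \ne Px_0$.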


Let us consider some examples of projections. 


EXAMPLE 2. Consider the space $L_2(-a,a)$ where $0 < a < \infty$. Let $M$ denote
the collection of all even functions, that is, $x$ is in $M$ if and only if $x(-t) = x(t)$,
almost everywhere. It is easy to see that $M$ is a linear subspace of $L_2(-a,a)$. Define
a mapping $P: L_2 \to L_2$ by $y = Px$, where

$$y(t) = \tfrac{1}{2}[x(t) + x(-t)].$$

It is clear that $M = \mathcal{R}(P)$ and that $Px = x$ for $x \in M$. Hence $P$ is a projection. The
null space of $P$ is characterized by $x \in \mathcal{N}(P)$ if and only if $x(t) = -x(-t)$, almost
everywhere, that is, $x$ is an odd function. Recall that if a function $z$ is odd and also
in $L_1(-a,a)$, then $\int_{-a}^{a} z(t)\,dt = 0$. But if $x \in \mathcal{R}(P)$ and $y \in \mathcal{N}(P)$, then $x\bar{y}$ is odd and
in $L_1(-a,a)$, so

$$(x,y) = \int_{-a}^{a} x\bar{y}\,dt = 0,$$

that is, $x \perp y$. We have shown that $\mathcal{R}(P) \perp \mathcal{N}(P)$; hence, $P$ is an orthogonal
projection. ∎
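
A discretized version of this example can be run directly. The following sketch assumes real-valued functions sampled on a symmetric grid over $(-a,a)$ and replaces the $L_2$ inner product by a Riemann sum; the grid size `N`, the sample function, and the helper names `P` and `inner` are illustrative assumptions rather than anything from the text.

```python
import math

N = 200                      # even number of subintervals (assumed)
a = 1.0
h = 2 * a / N
t = [-a + k * h for k in range(N + 1)]     # grid point t[N - k] mirrors t[k]

def P(x):
    """Even part on the grid: (Px)(t) = (x(t) + x(-t)) / 2."""
    return [0.5 * (x[k] + x[N - k]) for k in range(N + 1)]

def inner(x, y):
    """Riemann-sum stand-in for the L2(-a, a) inner product (real case)."""
    return h * sum(xk * yk for xk, yk in zip(x, y))

x = [math.exp(tk) + tk ** 3 for tk in t]   # arbitrary sample function
even = P(x)                                # component in the range of P
odd = [xk - ek for xk, ek in zip(x, even)] # component in the null space of P

print("max |P(Px) - Px| =", max(abs(p - e) for p, e in zip(P(even), even)))
print("max |P(odd)|     =", max(abs(v) for v in P(odd)))
print("(even, odd)      =", inner(even, odd))
```

All three printed quantities are zero up to rounding, which mirrors $P^2 = P$, $P(x - Px) = 0$, and $\mathcal{R}(P) \perp \mathcal{N}(P)$ in the discretized setting.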


EXAMPLE 3. Consider the space $L_2(I)$, where $I$ is any interval, and let $A$ be
a measurable set in $I$; for example, $A$ may be a subinterval. Define $P_A$ by $y = P_A x$,
where

$$y(t) = \begin{cases} x(t), & t \in A, \\ 0, & t \notin A. \end{cases}$$

It is easy to see that $P_A$ is a projection; in fact it is an orthogonal projection with
range

$$\mathcal{R}(P_A) = \{x \in L_2(I) : x(t) = 0 \text{ for } t \notin A\}$$

and null space

$$\mathcal{N}(P_A) = \{x \in L_2(I) : x(t) = 0 \text{ for } t \in A\}.$$ ∎


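EXAMPLE 4. It is shown in the exercises in Section E.5 that the conditional
expectation operator $E^{\mathcal{B}}$ is a bounded projection of $L_2(\Omega,\mathcal{F},P)$ into itself. Recall
that if $(\Omega,\mathcal{F},P)$ is a probability space and if $\mathcal{B}$ is a sub-$\sigma$-field of $\mathcal{F}$, then

$$\int_B E^{\mathcal{B}}[X]\,dP_{\mathcal{B}} = \int_B X\,dP$$

for all $B \in \mathcal{B}$. Let us show that $E^{\mathcal{B}}$ is an orthogonal projection on $L_2(\Omega,\mathcal{F},P)$. First
we recall (Exercise 11, Section E.5) that if $X, Y \in L_2(\Omega,\mathcal{F},P)$, then

$$E^{\mathcal{B}}[E^{\mathcal{B}}[X]\,Y] = E^{\mathcal{B}}[X]\,E^{\mathcal{B}}[Y].$$

Thus if $Y \in \mathcal{N}(E^{\mathcal{B}})$, then $Y \perp E^{\mathcal{B}}[X]$ for all $X$ in $L_2(\Omega,\mathcal{F},P)$, that is, $\mathcal{N}(E^{\mathcal{B}}) \perp \mathcal{R}(E^{\mathcal{B}})$.
Hence $E^{\mathcal{B}}$ is an orthogonal projection.

We note in passing that one can easily show that

$$\mathcal{R}(E^{\mathcal{B}}) = L_2(\Omega,\mathcal{B},P_{\mathcal{B}}).$$ ∎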


EXERCISES 


1. Contrast the geometric properties of Hilbert spaces with those of incomplete 
inner product spaces. 

2. Let $M^n$ denote the space of $n \times n$ matrices with complex coefficients and assume
that $M^n$ has an inner product given by

$$(A,B) = \sum_{i,j=1}^{n} a_{ij}\,\overline{b_{ij}} = \operatorname{trace} B^*A.$$

Let $A \in M^n$ and let $\{\lambda_1,\ldots,\lambda_n\}$ denote the eigenvalues of $A$, that is, all complex
numbers $\lambda$ with the property that $\det(\lambda I - A) = 0$. Let $\Gamma$ be any simple closed
curve in the complex plane that does not meet any of the eigenvalues $\{\lambda_1,\ldots,\lambda_n\}$.

(a) Show that $P_\Gamma = \frac{1}{2\pi i}\int_\Gamma (\lambda I - A)^{-1}\,d\lambda$ is in $M^n$. ($P_\Gamma$ is defined as the limit of
the Riemann sums

$$\frac{1}{2\pi i}\sum_{i=1}^{k} [\lambda_i^* I - A]^{-1}(\lambda_i - \lambda_{i-1}),$$

where $\{\lambda_0,\lambda_1,\ldots,\lambda_k = \lambda_0\}$ is a partition of the curve $\Gamma$ and $\lambda_i^*$ is a point
on the arc $\lambda_{i-1}\lambda_i$. The limit is taken as the arc lengths $|\lambda_{i-1}\lambda_i|$ tend to zero.)
(b) Show that $P_\Gamma$ is a projection on $C^n$.
(c) Show that $P_\Gamma = 0$ if $\Gamma$ does not enclose any of the eigenvalues $\{\lambda_1,\ldots,\lambda_n\}$.
(d) Show that $P_\Gamma = I$ if $\Gamma$ encloses all of the eigenvalues $\{\lambda_1,\ldots,\lambda_n\}$.




3. Let $\{P_1,\ldots,P_n\}$ be a collection of orthogonal projections with $P_iP_j = 0$ for
$i \ne j$.
(a) Show that $Q = P_1 + \cdots + P_n$ is an orthogonal projection.
(b) What happens if one drops the assumption that $P_iP_j = 0$ for $i \ne j$?

4. Let $M$ and $N$ be closed linear subspaces of a Hilbert space $H$, and let $P$ and $Q$
denote the orthogonal projections onto $M$ and $N$, respectively. We say that the
ordered pair $(M,N)$ is compatible if

$$(M \cap N) + (M \cap N^\perp) = M.$$

(a) Show that $(M,N)$ is compatible if and only if $(N,M)$ is compatible.

(b) Show that $(M,N)$ is compatible if and only if $P$ and $Q$ commute.

(c) Give an example of a pair of incompatible spaces. (This concept is important
in quantum mechanics, see Jauch [2, pp. 80-86].)


17. ORTHONORMAL SETS AND BASES: GENERALIZED FOURIER SERIES 


Orthonormal Sets 


We discussed Hamel bases, a purely algebraic concept, in Section 4.7. Everything
said there is, of course, applicable to Banach and Hilbert spaces. In the finite-dimensional
case that is almost all that needs to be said. However, with topological
structure now present, we have new opportunities open to us. In particular, it is
now possible to attach meaning to infinite linear combinations of the form

$$\sum_{i=1}^{\infty} \alpha_i x_i,$$

whereas in Chapter 4 we were limited to finite linear combinations. Thus we have
the possibility of introducing types of bases which involve topological as well as algebraic
structure. Without question the most useful such concept is that of an
orthonormal basis in a Hilbert space. We will study it in this section and the next.
We start with orthogonal and orthonormal sets.


5.17.1 DEFINITION. A set of points $\{x_\alpha\}$ in an inner product space $X$ is said
to be orthogonal if $x_\alpha \perp x_\beta$ whenever $\alpha \ne \beta$.


It is possible for a set of orthogonal points to contain the origin 0, since 0 is 
orthogonal to any point in X. 


5.17.2 DEFINITION. A set of points $\{x_\alpha\}$ in an inner product space $X$ is said to
be orthonormal if

$$(x_\alpha,x_\beta) = \delta_{\alpha\beta}$$

for all $\alpha$ and $\beta$, where $\delta_{\alpha\beta}$ is the Kronecker function.




In both these definitions the index $\alpha$ may range over a finite, countably infinite,
or uncountably infinite index set. Note that any orthogonal set of nonzero points
$\{x_\alpha\}$ can be changed into an orthonormal set by replacing $x_\alpha$ with $x_\alpha/\|x_\alpha\|$.
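
The normalization step can be sketched for a finite orthogonal set in $\mathbb{R}^3$. The vectors below and the helper names `dot` and `normalize` are illustrative assumptions; the Gram matrix of the normalized set is computed so that the relation $(x_\alpha,x_\beta) = \delta_{\alpha\beta}$ can be inspected.

```python
import math

def dot(a, b):
    """Real inner product on R^n."""
    return sum(ai * bi for ai, bi in zip(a, b))

def normalize(vectors):
    """Divide each nonzero vector by its norm."""
    out = []
    for v in vectors:
        length = math.sqrt(dot(v, v))
        if length == 0.0:
            raise ValueError("the zero vector cannot be normalized")
        out.append([vi / length for vi in v])
    return out

# An orthogonal (but not orthonormal) set of nonzero vectors, chosen for illustration.
orthogonal = [[1.0, 1.0, 0.0], [1.0, -1.0, 0.0], [0.0, 0.0, 2.0]]
orthonormal = normalize(orthogonal)

# Gram matrix: entry (i, j) is the inner product of the i-th and j-th vectors.
gram = [[dot(u, v) for v in orthonormal] for u in orthonormal]
for row in gram:
    print(row)
```

Up to rounding, the printed Gram matrix is the identity. The explicit check for a zero vector reflects the remark that an orthogonal set may contain the origin, which cannot be normalized.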


5.17.3 THEOREM. Let $\{x_\alpha\}$ be an orthonormal set of points in an inner product
space $X$. Then $\{x_\alpha\}$ is linearly independent.


Proof: Let $\{x_1,\ldots,x_n\}$ be any finite set from $\{x_\alpha\}$ and consider

$$\alpha_1 x_1 + \cdots + \alpha_n x_n = 0. \tag{5.17.1}$$

We want to show that the only solution of (5.17.1) is $\alpha_1 = \cdots = \alpha_n = 0$. Let us show
that $\alpha_1 = 0$. Indeed,

$$0 = (0,x_1) = (\alpha_1 x_1 + \cdots + \alpha_n x_n, x_1) = \alpha_1(x_1,x_1) + \cdots + \alpha_n(x_n,x_1) = \alpha_1,$$

because $(x_i,x_1) = \delta_{i1}$ for $1 \le i \le n$. Similarly, we get $\alpha_2 = \cdots = \alpha_n = 0$. ∎


5.17.4 DEFINITION. We shall say that an orthonormal set $B = \{x_\alpha\}$ in an
inner product space $X$ is maximal$^{10}$ if there is no unit vector $x_0$ in $X$ such that
$B \cup \{x_0\}$ is an orthonormal set.


Another way to say the same thing is given in the next lemma.


5.17.5 LEMMA. An orthonormal set $B = \{x_\alpha\}$ in an inner product space $X$ is
maximal if and only if $x \perp x_\alpha$ for all $\alpha$ implies that $x = 0$.


The first thing to be said about maximal orthonormal sets is that there are
plenty of them available. In fact, one can prove the following fairly strong result.


5.17.6 THEOREM. Let $\{x_\alpha\}$ be any orthonormal set in an inner product space $X$.
Then there is a maximal orthonormal set $B$ in $X$ with $\{x_\alpha\} \subset B$.


The proof of this theorem is a straightforward application of Zorn's lemma
(Appendix C) and we shall only outline it here. Let $\{x_\alpha\}$ be an orthonormal set
in $X$, and consider all orthonormal sets in $X$ that contain $\{x_\alpha\}$. Order these sets in
the obvious fashion and then verify that Zorn's lemma can be applied.

The main result in this section (Theorem 5.17.8) is that maximal orthonormal
sets in Hilbert spaces have a natural geometric interpretation as orthogonal coordinate
systems, or generalized Fourier series. Completeness is important here so
we give maximal orthonormal sets in complete spaces a special name.


5.17.7 DEFINITION. A maximal orthonormal set $B$ in a Hilbert space $H$ is
referred to as an orthonormal basis for $H$.


10 Many authors use the term "complete orthonormal set." We will not do this here since it seems
preferable to reserve the use of the term "complete" for the Cauchy sequence property in metric
spaces.




It follows from Theorem 5.17.6 that any orthonormal set in a Hilbert space
can be extended to form an orthonormal basis. It can also be shown that any
two orthonormal bases of a Hilbert space have the same cardinality; see Exercise 6.


The Fourier Series Theory 


The next theorem, then, states the fundamental properties of orthonormal
bases. Carefully note how topological as well as algebraic structure comes into
play.


5.17.8 THEOREM. (FOURIER SERIES THEOREM.) Let $\{x_n\}$ be an orthonormal set
in a Hilbert space $H$. Then the following statements are equivalent:


(a) $\{x_n\}$ is an orthonormal basis.
(b) (Fourier series expansion.) For any $x$ in $H$ one has

$$x = \sum_n (x,x_n)\,x_n.$$

(c) (Parseval Equality.) For any two vectors $x$ and $y$ in $H$ one has

$$(x,y) = \sum_n (x,x_n)\,\overline{(y,x_n)}.$$

(d) For any $x$ in $H$ one has

$$\|x\|^2 = \sum_n |(x,x_n)|^2.$$

(e) Let $M$ be any linear subspace of $H$ that contains $\{x_n\}$. Then $M$ is dense in $H$.


The coefficients $(x,x_n)$ in the series expansion $x = \sum_n (x,x_n)x_n$ are often called the
Fourier coefficients of $x$.


This theorem will be proved momentarily. However, before doing this we need
three preliminary results. The first of these is the Bessel Inequality. The second is a
discussion of the convergence of $\sum_n \alpha_n x_n$ when $\{x_n\}$ is an orthonormal set. The third
result gives a formula for computing the values of an orthogonal projection $P$ in
terms of an orthonormal basis in the range $\mathcal{R}(P)$.

In order to slightly simplify the following discussion, let us assume for the 
moment that the orthonormal sets and bases we consider are countable. The un- 
countable case can be quickly handled afterwards. In fact, we shall show that the 
theorems that follow are also meaningful for the uncountable case. 


5.17.9 LEMMA. (THE BESSEL INEQUALITY.) Let $\{x_n\}$ be an orthonormal set in
an inner product space $X$. Then for any $x$ in $X$ one has

$$\sum_n |(x,x_n)|^2 \le \|x\|^2. \tag{5.17.2}$$




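Proof: Consider the finite subset $\{x_1,\ldots,x_N\}$ from $\{x_n\}$. Then one has

$$0 \le \Big\|x - \sum_{i=1}^{N} (x,x_i)x_i\Big\|^2 = \Big(x - \sum_{i=1}^{N} (x,x_i)x_i,\ x - \sum_{j=1}^{N} (x,x_j)x_j\Big)$$

$$= (x,x) - \sum_{i=1}^{N} (x,x_i)(x_i,x) - \sum_{j=1}^{N} \overline{(x,x_j)}\,(x,x_j) + \sum_{i=1}^{N}\sum_{j=1}^{N} (x,x_i)\,\overline{(x,x_j)}\,(x_i,x_j)$$

$$= \|x\|^2 - \sum_{i=1}^{N} |(x,x_i)|^2,$$

since $(x_i,x_j) = \delta_{ij}$. Therefore (5.17.2) holds for finite sums. Since the right side
of (5.17.2) does not depend on $N$ we see that it also holds for countable sums. ∎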
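
A small numerical check of the Bessel Inequality may help fix ideas. The sketch below assumes the real space $\mathbb{R}^4$ with the dot product and an orthonormal set of two vectors that does not span the whole space; the specific vectors and variable names are illustrative only.

```python
def dot(a, b):
    """Real inner product on R^n."""
    return sum(ai * bi for ai, bi in zip(a, b))

# Two orthonormal vectors in R^4 (assumed example; they do not span R^4).
x1 = [0.5, 0.5, 0.5, 0.5]
x2 = [0.5, -0.5, 0.5, -0.5]
x = [1.0, 2.0, 3.0, 4.0]

coeffs = [dot(x, x1), dot(x, x2)]          # Fourier coefficients (x, x_i)
bessel_sum = sum(c * c for c in coeffs)    # sum of |(x, x_i)|^2
norm_sq = dot(x, x)                        # ||x||^2

# The gap equals the squared norm of the residual x - sum (x, x_i) x_i.
residual = [xi - coeffs[0] * a - coeffs[1] * b for xi, a, b in zip(x, x1, x2)]

print("sum |(x,x_i)|^2 =", bessel_sum)     # 26.0
print("||x||^2         =", norm_sq)        # 30.0
print("||residual||^2  =", dot(residual, residual))  # 4.0
```

The inequality is strict here because the orthonormal set is not maximal in $\mathbb{R}^4$; the gap $30 - 26 = 4$ is exactly the squared norm of the residual, as in the proof.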


Next let us take a careful look at series of the form $\sum_n \alpha_n x_n$ where $\{x_n\}$ is an
orthonormal set.


5.17.10 LEMMA. Let $\{x_n\}$ be a countably infinite orthonormal set in a Hilbert
space $H$. Then the following assertions are valid:


(a) The infinite series $\sum_{n=1}^{\infty} \alpha_n x_n$ (where the $\alpha_n$'s are scalars) converges if and
only if the series of real numbers $\sum_{n=1}^{\infty} |\alpha_n|^2$ converges.

(b)$^{11}$ Assume that $\sum_{n=1}^{\infty} \alpha_n x_n$ converges and let

$$x = \sum_{n=1}^{\infty} \alpha_n x_n = \sum_{n=1}^{\infty} \beta_n x_n.$$

Then $\alpha_n = \beta_n$ for all $n$ and $\|x\|^2 = \sum_{n=1}^{\infty} |\alpha_n|^2$.
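Proof:
(a) Suppose that $\sum_{n=1}^{\infty} \alpha_n x_n$ is convergent and let $x = \sum_{n=1}^{\infty} \alpha_n x_n$, that is,

$$\lim_{N\to\infty} \Big\|x - \sum_{n=1}^{N} \alpha_n x_n\Big\|^2 = 0.$$

Then, since the inner product is continuous, one has

$$(x,x_j) = \Big(\sum_{n=1}^{\infty} \alpha_n x_n,\ x_j\Big) = \sum_{n=1}^{\infty} \alpha_n (x_n,x_j) = \alpha_j, \quad \text{for all } j.$$

Then from Bessel's Inequality one gets

$$\sum_{n=1}^{\infty} |(x,x_n)|^2 = \sum_{n=1}^{\infty} |\alpha_n|^2 \le \|x\|^2,$$

which shows that $\sum_n |\alpha_n|^2$ converges. [Note: We do not need completeness for this
part of the proof.]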


11 The completeness of H is not crucial for the proof of statement (b). 




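Next suppose that $\sum_{i=1}^{\infty} |\alpha_i|^2$ converges, and let $s_n = \sum_{i=1}^{n} \alpha_i x_i$. It follows that

$$\|s_n - s_m\|^2 = \sum_{i=m+1}^{n} |\alpha_i|^2,$$

so $\{s_n\}$ is a Cauchy sequence. Since $H$ is complete, the sequence of partial sums
$\{s_n\}$ is convergent. This completes the proof of statement (a).

(b) Let us first prove that $\|x\|^2 = \sum_n |\alpha_n|^2$:

$$\Big|\,\|x\|^2 - \sum_{n=1}^{N} |\alpha_n|^2\Big| = \Big|\Big(x,\ x - \sum_{n=1}^{N} \alpha_n x_n\Big) + \Big(x - \sum_{n=1}^{N} \alpha_n x_n,\ \sum_{n=1}^{N} \alpha_n x_n\Big)\Big|$$

$$\le \|x\|\,\Big\|x - \sum_{n=1}^{N} \alpha_n x_n\Big\| + \Big\|x - \sum_{n=1}^{N} \alpha_n x_n\Big\|\,\Big\|\sum_{n=1}^{N} \alpha_n x_n\Big\|$$

$$\le 2\|x\|\,\Big\|x - \sum_{n=1}^{N} \alpha_n x_n\Big\| \to 0.$$

Hence $\|x\|^2 = \sum_n |\alpha_n|^2$.

Now if $x = \sum_n \alpha_n x_n = \sum_n \beta_n x_n$, then

$$0 = \lim_{N\to\infty} \Big[\sum_{n=1}^{N} \alpha_n x_n - \sum_{n=1}^{N} \beta_n x_n\Big] = \lim_{N\to\infty} \sum_{n=1}^{N} (\alpha_n - \beta_n)x_n,$$

or $0 = \sum_{n=1}^{\infty} (\alpha_n - \beta_n)x_n$. By the last paragraph we see that $0 = \sum_n |\alpha_n - \beta_n|^2$ or
$\alpha_n = \beta_n$ for all $n$. ∎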

Since the use of orthonormal bases involves infinite series, one does have to
consider the possibility that these series may converge only
conditionally. The next corollary shows that this is not an issue.


5.17.11 COROLLARY. Let $\{x_i\}$ be a countably infinite orthonormal set in a
Hilbert$^{12}$ space $H$. Then the infinite series $\sum_{i=1}^{\infty} \alpha_i x_i$, where the $\alpha_i$'s are scalars, is
convergent if and only if it is unconditionally convergent.


Proof: If the series is unconditionally convergent, it is certainly convergent.
On the other hand, a simple application of Lemma 5.17.10 shows that if $\sum \alpha_i x_i$ is
convergent, then any rearrangement of this series is convergent. We leave it as
an exercise to show that the limit is independent of rearrangements. ∎


The next thing we wish to look at is the Projection Theorem in terms of an
orthonormal basis. More specifically, if $\{x_n\}$ is an orthonormal set in a Hilbert
space $H$, we wish to show that the formula

$$Px = \sum_n (x,x_n)\,x_n$$

defines an orthogonal projection on $H$.

Consider first the finite-dimensional case. Let $B = \{x_1,x_2,\ldots,x_n\}$ be a finite
orthonormal set in $H$ and let $M$ be the linear subspace spanned by $B$. Then $M$ is
closed because it is finite dimensional. Furthermore, we know from the Projection
Theorem (Theorem 5.16.4) that there is a unique orthogonal projection $P$ on $H$ with
$\mathcal{R}(P) = M$.


12 Completeness is not crucial here.


5.17.12 LEMMA. Let $B = \{x_1,x_2,\ldots,x_n\}$ be a finite orthonormal set in a
Hilbert space $H$, and let $M$ be the finite dimensional linear subspace of $H$ spanned by
$B$. Then the orthogonal projection of $H$ onto $M$ is given by

$$Px = \sum_{i=1}^{n} (x,x_i)\,x_i.$$


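Proof$^{13}$: It is obvious that $P$ is a linear mapping of $H$ into itself. Now let
$x$ be any point in $H$. Since

$$Px_j = \sum_{i=1}^{n} (x_j,x_i)\,x_i = \sum_{i=1}^{n} \delta_{ji}\,x_i = x_j,$$

one has

$$P^2x = P\Big(\sum_{i=1}^{n} (x,x_i)\,x_i\Big) = \sum_{i=1}^{n} (x,x_i)\,Px_i = \sum_{i=1}^{n} (x,x_i)\,x_i = Px.$$

Hence, $P$ is a projection. Moreover, it is obvious that $\mathcal{R}(P) \subset M$. Conversely, if
$x \in M$, then $x = \alpha_1 x_1 + \alpha_2 x_2 + \cdots + \alpha_n x_n$, and it can be seen that $Px = x$, so
$\mathcal{R}(P) = M$.

Next we show that $P$ is orthogonal. Let $y \in \mathcal{N}(P)$ and $x \in \mathcal{R}(P)$. We want to
show that $(y,x) = 0$. Since $x \in \mathcal{R}(P)$, we have $x = Px$ and

$$(y,x) = (y,Px) = \Big(y,\ \sum_{i=1}^{n} (x,x_i)\,x_i\Big) = \sum_{i=1}^{n} \overline{(x,x_i)}\,(y,x_i) = \sum_{i=1}^{n} (y,x_i)(x_i,x)$$

$$= \Big(\sum_{i=1}^{n} (y,x_i)\,x_i,\ x\Big) = (Py,x) = (0,x) = 0.$$

So $P$ is orthogonal. ∎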


5.17.13 COROLLARY. Let $\{x_1, \ldots, x_n\}$ be an orthonormal set in a Hilbert space
H and let $x \in H$. Then for any choice of complex numbers $\{c_1, \ldots, c_n\}$ one has

$$\Big\|x - \sum_{i=1}^{n} (x,x_i)x_i\Big\| \le \Big\|x - \sum_{i=1}^{n} c_ix_i\Big\|. \qquad (5.17.3)$$


Proof: This corollary is a direct consequence of Theorem 5.16.5 and the last
lemma. ∎
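
As a numerical illustration of Lemma 5.17.12 and Corollary 5.17.13 (not part of the original text), the following Python sketch works in $C^3$ with the usual inner product. The helper names `inner`, `norm`, `combine`, and `project`, the orthonormal pair `basis`, the vector `x`, and the competitor coefficients are all assumptions chosen only for the example.

```python
import math

def inner(u, v):
    """Usual inner product on C^n: (u, v) = sum_i u_i * conj(v_i)."""
    return sum(a * complex(b).conjugate() for a, b in zip(u, v))

def norm(u):
    return math.sqrt(inner(u, u).real)

def combine(coeffs, basis):
    """Return the finite linear combination sum_i coeffs[i] * basis[i]."""
    out = [0j] * len(basis[0])
    for c, b in zip(coeffs, basis):
        out = [o + c * bi for o, bi in zip(out, b)]
    return out

def project(x, basis):
    """Px = sum_i (x, x_i) x_i for an orthonormal list `basis`."""
    return combine([inner(x, b) for b in basis], basis)

s = 1 / math.sqrt(2)
basis = [[1, 0, 0], [0, s, 1j * s]]   # an orthonormal set in C^3 (illustrative)
x = [2, 1 + 1j, 3]                    # arbitrary vector (illustrative)

Px = project(x, basis)
residual = [a - b for a, b in zip(x, Px)]

# An arbitrary competitor c_1 x_1 + c_2 x_2 in M, as in (5.17.3).
other = combine([0.5, 2 - 1j], basis)
dist_best = norm(residual)
dist_other = norm([a - b for a, b in zip(x, other)])

print("P(Px) == Px:", all(abs(a - b) < 1e-12 for a, b in zip(project(Px, basis), Px)))
print("x - Px orthogonal to M:", all(abs(inner(residual, b)) < 1e-12 for b in basis))
print("best-approximation inequality holds:", dist_best <= dist_other)
```

The three checks correspond to $P^2 = P$, to $x - Px \perp M$, and to Inequality (5.17.3) for one particular choice of the $c_i$.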


13 The reader should reconsider this proof after the adjoint operator is introduced in Section 2? 
Also notice that the completeness of H is not really needed here. 




Now let us turn to the infinite-dimensional case. 

Let $\{x_n\}$ be an orthonormal set in a Hilbert space H. The closed linear
subspace M of H generated by the $\{x_n\}$ is defined to be the closure of the linear subspace
generated by the $\{x_n\}$. That is, first form the linear subspace generated by the
$\{x_n\}$ in the sense of Section 4.6, and then take the closure of this space in the sense
of Section 3.12. Recall that the closure of a linear subspace is a closed linear
subspace (Theorem 5.5.2).


5.17.14 LEMMA. Let $B = \{x_n\}$ be a countable orthonormal set in a Hilbert¹⁴
space H, and let M be the closed linear subspace generated by the set B. Then every
vector $x \in M$ can be written uniquely as

$$x = \sum_n (x,x_n)x_n. \qquad (5.17.4)$$

Moreover, the mapping P defined by

$$Px = \sum_n (x,x_n)x_n \qquad (5.17.5)$$

is the orthogonal projection of H onto M.


Proof: It is clear that any vector of the form (5.17.4) is in M. The problem is
to show the converse. So let $x \in M$. Then $x = \lim_{N\to\infty} y_N$, where each $y_N$ is a finite
linear combination of vectors in B, that is, $y_N = \sum_{n=1}^{K} \alpha_nx_n$, where the $\alpha_n$'s and K
depend on N. By adjoining additional terms if necessary, we can assume that
$K \ge N$. It follows from Corollary 5.17.13 that

$$\Big\|x - \sum_{n=1}^{K} (x,x_n)x_n\Big\| \le \Big\|x - \sum_{n=1}^{K} \alpha_nx_n\Big\| = \|x - y_N\|.$$

Since $\|x - y_N\| \to 0$ as $N \to \infty$, this implies that $x = \sum_n (x,x_n)x_n$. The uniqueness of
(5.17.4) follows from Lemma 5.17.10(b).

Let $y = Px$, where P is defined by (5.17.5). By using Lemma 5.17.10 and the
Bessel Inequality we get

$$\|Px\|^2 = \Big\|\sum_n (x,x_n)x_n\Big\|^2 = \sum_n |(x,x_n)|^2 \le \|x\|^2.$$

It follows that P is continuous. Then the proof that P is an orthogonal projection is
essentially the same as the proof of Lemma 5.17.12. One merely calls upon the
continuity of P and the continuity of the inner product to justify the interchange
with summations. ∎


Let us now prove the Fourier Series Theorem. 


Proof of Fourier Series Theorem: 
(a) ⇒ (b). Assume that $\{x_n\}$ is a maximal orthonormal set in H and let M be
the closed linear subspace of H generated by the set $\{x_n\}$. If $x \in M^{\perp}$, then $x \perp x_n$
for all n. But $\{x_n\}$ is maximal, so $x = 0$ and $M^{\perp} = \{0\}$. It follows from Theorem
5.15.4(c) that $M = H$. (Here we are using the completeness of H!) Therefore, the
orthogonal projection onto M is the identity map I and by Lemma 5.17.14 we have

$$Ix = x = \sum_n (x,x_n)x_n$$

for every x in H.


14 Completeness is not crucial here.


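(b) ⇒ (c). Let $x = \sum_n (x,x_n)x_n$ and $y = \sum_m (y,x_m)x_m$. Since $(x_n,x_m) = \delta_{nm}$ and
the inner product is continuous we get

$$(x,y) = \Big(\sum_n (x,x_n)x_n,\ \sum_m (y,x_m)x_m\Big) = \sum_n \sum_m (x,x_n)\overline{(y,x_m)}\,(x_n,x_m) = \sum_n (x,x_n)\overline{(y,x_n)}.$$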


(c)=>(d). This is obvious. 


(d) ⇒ (a). If $\{x_n\}$ is not maximal, then there is a unit vector $x_0$ such that
$\{x_0\} \cup \{x_n\}$ is an orthonormal set. Using (d) and the fact that $(x_0,x_n) = 0$ for all
n, we get

$$1 = \|x_0\|^2 = \sum_n |(x_0,x_n)|^2 = 0,$$

a contradiction.

(b) ⇔ (e). Statement (e) says that the orthogonal projection onto M is the
identity, and by Lemma 5.17.14 we see that this is equivalent to statement (b). ∎


The following corollary is now a simple exercise. Carefully compare it with
Lemma 5.17.14. In particular, the following result requires completeness whereas
Lemma 5.17.14 does not.


5.17.15 COROLLARY. Let M be any closed linear subspace of a Hilbert space
H and let $\{x_n\}$ be an orthonormal basis for M. Then the orthogonal projection P of
H onto M is given by $Px = \sum_n (x,x_n)x_n$.


The Gram-Schmidt Process 


Given any countable linearly independent set $\{y_n\}$ in an inner product space, it
is always possible, in principle at least, to construct an orthonormal set from it. The
construction, which is called the Gram-Schmidt orthogonalization process, is rather
important, so we shall describe it here. Given the set $\{y_n\}$ we propose to construct an
orthonormal set $\{x_n\}$ with the property that $x_k$ is a linear combination of the vectors
$y_1, y_2, \ldots, y_k$ for $k = 1, 2, 3, \ldots$. This is done by induction. Let $x_1 = y_1/\|y_1\|$.
If $x_1, \ldots, x_k$ have been determined, we define $x_{k+1}$ by

$$x_{k+1} = \alpha\Big[y_{k+1} - \sum_{i=1}^{k} (y_{k+1},x_i)x_i\Big], \qquad (5.17.6)$$

where $\alpha$ is a scalar chosen so that $\|x_{k+1}\| = 1$. This induction step is illustrated in
Figure 5.17.1. We leave it to the reader to show that we do indeed generate an
orthonormal set in this way.

Figure 5.17.1. The vector $\sum_{i=1}^{k} (y_{k+1},x_i)x_i$ is the projection of $y_{k+1}$ onto the subspace $M_k$ spanned by $y_1, \ldots, y_k$ or, equivalently, by $x_1, \ldots, x_k$; the difference $y_{k+1} - \sum_{i=1}^{k} (y_{k+1},x_i)x_i$ is the remaining component of $y_{k+1}$.
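
The induction step (5.17.6) can be sketched in Python for finite lists of vectors in $C^n$. This is only an illustrative sketch under stated assumptions: the function names `inner` and `gram_schmidt` and the sample vectors are not from the text, and the routine implements the classical form of the process exactly as written in (5.17.6), without any of the stabilizing modifications used in numerical practice.

```python
import math

def inner(u, v):
    """Usual inner product on C^n: (u, v) = sum_i u_i * conj(v_i)."""
    return sum(a * complex(b).conjugate() for a, b in zip(u, v))

def gram_schmidt(ys):
    """Apply (5.17.6) to a linearly independent list `ys`.

    Returns the orthonormal vectors x_1, x_2, ... together with the
    normalizing scalars alpha used at each step.
    """
    xs, alphas = [], []
    for y in ys:
        r = [complex(c) for c in y]
        for x in xs:
            c = inner(y, x)                      # coefficient (y_{k+1}, x_i)
            r = [ri - c * xi for ri, xi in zip(r, x)]
        length = math.sqrt(inner(r, r).real)
        if length == 0.0:
            raise ValueError("input vectors are linearly dependent")
        alpha = 1.0 / length                     # chosen so that ||x_{k+1}|| = 1
        xs.append([alpha * ri for ri in r])
        alphas.append(alpha)
    return xs, alphas

ys = [[1, 1, 0], [1, 0, 1], [0, 1, 1]]           # illustrative input
xs, alphas = gram_schmidt(ys)
gram = [[inner(a, b) for b in xs] for a in xs]   # should be close to the identity
```

Here `gram` holds the inner products $(x_i,x_j)$, which should agree with $\delta_{ij}$ up to rounding error.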

We mention in passing that the Gram-Schmidt orthogonalization process can 
sometimes lead to a very sensitive computation. The difficulty arises when the 
vectors in the original linearly independent set {y,} get very close to being collinear. 
In that case, the number « can become very large. 


EXAMPLE 1. In $C^2$ with the usual inner product, let $y_1 = (1,0)$, $y_2 = (1,\varepsilon)$
where $\varepsilon > 0$. Then $x_1 = y_1$ and $x_2 = \alpha[y_2 - (y_2,x_1)x_1] = \alpha(0,\varepsilon)$. Hence $\alpha = 1/\varepsilon$. ∎
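
The growth of $\alpha$ in Example 1 is easy to observe numerically. The following short Python sketch (the function name `alpha_example1` and the sample values of $\varepsilon$ are assumptions made for illustration) carries out the single Gram-Schmidt step of the example with real arithmetic.

```python
import math

def alpha_example1(eps):
    """Normalizing scalar alpha for y1 = (1, 0), y2 = (1, eps)."""
    x1 = (1.0, 0.0)                              # x1 = y1, already a unit vector
    y2 = (1.0, eps)
    c = y2[0] * x1[0] + y2[1] * x1[1]            # (y2, x1)
    r = (y2[0] - c * x1[0], y2[1] - c * x1[1])   # y2 - (y2, x1) x1 = (0, eps)
    return 1.0 / math.hypot(r[0], r[1])

for eps in (1e-1, 1e-3, 1e-6):
    print(eps, alpha_example1(eps))
```

As $\varepsilon$ decreases, $y_1$ and $y_2$ become nearly collinear and the printed values of $\alpha$ grow like $1/\varepsilon$.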


EXAMPLE 2. In $l_2$ with the usual inner product, let


Yn = Vin Vans +)s Wendl, 2544 


be given with y,, = 1, for | <i<n, y,44,, =, where 0 <e< 1, and y,, =0 for 
n+2 <i. It follows that x, = (6,;,6,,...), however, the « in (5.17.6) now becomes 
« =e *, This phenomenon can and does cause a problem in computer applications. 
The reason for this is that the angle between $y_n$ and $y_{n+1}$ becomes very small as n
tends to $\infty$. ∎


So far we have been restricting ourselves to Hilbert spaces with countable 
orthonormal bases. The next theorem shows that such spaces possess a very useful 
topological property: They are separable. 


Figure 5.17.2.


5.17.16 THEOREM. A Hilbert space H has a countable orthonormal basis if and only
if it is separable.


The proof of this is outlined in the exercises.
Because of Theorem 5.17.16, one usually refers to Hilbert spaces with countable
orthonormal bases as separable Hilbert spaces.


Uncountable Bases 


The next obvious question is "what can we do with nonseparable spaces?"
Our basic problem is to say what we mean by the series

$$\sum_{\alpha} |(x,x_{\alpha})|^2 \quad\text{and}\quad \sum_{\alpha} (x,x_{\alpha})x_{\alpha} \qquad (5.17.7)$$

when the orthonormal set $\{x_{\alpha}\}$ is not countable. It turns out that (5.17.7) has a very
straightforward interpretation. This is shown with the aid of the next result.


5.17.17 LEMMA. Let $\{x_{\alpha}\}$ be an orthonormal set in an inner product space X,
and let x be any point in X. Then $(x,x_{\alpha})$ is nonzero for at most a countable number
of $x_{\alpha}$'s.


In other words, if $\{x_{\alpha}\}$ is uncountable, then "most" of the coefficients $(x,x_{\alpha})$
are zero. Needless to say, this result may be somewhat of a shock to geometric
intuition.


Proof: Let x be any point in X, and let A denote the set of all $x_{\alpha}$ such that
$|(x,x_{\alpha})| > 0$. Then let

$$A_n = \{x_{\alpha} \in A : |(x,x_{\alpha})|^2 > \|x\|^2/n\},$$

where $n = 1, 2, 3, \ldots$. It follows from the version of Bessel's Inequality for
countable orthonormal sets already proved, Lemma 5.17.9, that $A_n$ contains at
most $(n - 1)$ vectors. Since $A = \bigcup_{n=1}^{\infty} A_n$, it follows that A is at most countably
infinite. ∎


Because of this lemma we see that both series in (5.17.7) contain at most only a
countable number of nonzero terms. We now can define when the two series in
(5.17.7) converge. Specifically let $\{x_{\alpha}\}$ be an orthonormal set where $\alpha$ ranges in some
index set A. We shall say that¹⁵

$$x = \sum_{\alpha \in A} (x,x_{\alpha})x_{\alpha} = \sum_{\alpha} (x,x_{\alpha})x_{\alpha}$$

provided for every $\varepsilon > 0$ there is a finite subset $E \subset A$ with the property that

$$\Big\|x - \sum_{\alpha \in F} (x,x_{\alpha})x_{\alpha}\Big\| < \varepsilon$$

for all finite sets $F \subset A$ with $E \subset F$. The definition of the convergence of $\sum_{\alpha} |(x,x_{\alpha})|^2$
is similar.


15 It should be noticed that this definition of convergence is different from that defined in Section
5.4, even in the case where the index set A is the set 1, 2, .... For example, the new definition of
convergence automatically implies unconditional convergence. On the other hand, we only apply
this new definition to orthonormal sets so the difference is more apparent than real, see Corollary
5.17.11.


On the basis of these definitions one can go back through this section and show 
the following for nonseparable spaces: 


(1) Bessel’s Inequality (Lemma 5.17.9) is still valid. 


(2) Lemma 5.17.10 would be meaningless as stated, but one could say that
the series $\sum_{\alpha} \beta_{\alpha}x_{\alpha}$ converges if and only if $\beta_{\alpha}$ is nonzero for at most a countable
number of $\alpha$'s and $\sum_{\alpha} |\beta_{\alpha}|^2 < \infty$.


(3) Lemmas 5.17.12 and 5.17.14 are modified in the same spirit. 


(4) Finally the proof of the Fourier Series Theorem carries over in exactly the 
same way. The same can be said for Corollary 5.17.15. 


In short, nonseparable Hilbert spaces offer no problems once Lemma 5.17.17 is 
available. 


Orthonormal Bases versus Hamel Bases 


We now have two kinds of bases for Hilbert spaces: Hamel bases and ortho- 
normal bases. Here are a few remarks that one can make about the two: 


(1) Hamel basis is a purely algebraic concept and it involves only finite 
linear combinations. One rarely uses Hamel bases in infinite-dimensional Hilbert 
spaces. 


(2) Orthonormal basis is a combined topological and algebraic concept and it 
allows countably infinite linear combinations. Orthonormal bases are extremely 
useful in infinite-dimensional Hilbert spaces. 


(3) Any orthonormal basis of a Hilbert space H is a subset of a Hamel basis
for H. (Why?)


Series with Nonorthogonal Entries 


This entire section up to this point has been devoted to orthonormal bases 
for Hilbert spaces. By this point the reader should appreciate their “‘clean’’ and 
simple structure. Unfortunately, there are cases where one must abandon this 
simplicity. Suppose that $\{x_n\}$ is a countably infinite linearly independent set in a
Hilbert space H. We do not assume that $\{x_n\}$ is orthonormal or even orthogonal.
Let M be the linear subspace of H spanned by $\{x_n\}$ in the sense of Section 4.6, that
is, M consists of all finite linear combinations of points in $\{x_n\}$. Next let $\overline{M}$ be the
closure of M. All we can say in general about a point $y \in \overline{M}$ is that there exists at
least one sequence $\{S_N\}$ in M, where each $S_N$ is a finite linear combination of points
in $\{x_n\}$, such that $y = \lim S_N$. Equivalently, there are scalars $\alpha_{Nj}$ such that

$$y = \lim_{N\to\infty} (\alpha_{N1}x_1 + \alpha_{N2}x_2 + \cdots + \alpha_{NN'}x_{N'}).$$

Naturally, if $z \in H$ and

$$z = \beta_1x_1 + \beta_2x_2 + \beta_3x_3 + \cdots,$$

then z is in $\overline{M}$. The very important point to be made here is that the converse is not
necessarily true. That is, if y is a point in $\overline{M}$, it does not follow that there exists a
convergent infinite series of the form $\alpha_1x_1 + \alpha_2x_2 + \cdots$ with limit y. (Compare with
Lemma 5.17.14.) This point is illustrated by the next example.


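EXAMPLE 3. Let $\{y, z_1, z_2, z_3, \ldots\}$ be an orthonormal set in a Hilbert space
H. Then construct the linearly independent set $\{x_n\}$ by

$$x_n = \Big(\cos\frac{1}{n}\Big)y + \Big(\sin\frac{1}{n}\Big)z_n, \qquad n = 1, 2, \ldots.$$

Since $y = \lim_{n\to\infty} x_n$, it follows that $y \in \overline{M}$, where $\overline{M}$ is defined above. Now suppose
that y can be written in the form

$$y = \alpha_1x_1 + \alpha_2x_2 + \cdots. \qquad (5.17.8)$$

Then

$$y = \sum_{n=1}^{\infty} \alpha_n\Big[\Big(\cos\frac{1}{n}\Big)y + \Big(\sin\frac{1}{n}\Big)z_n\Big],$$

and since $\{y, z_1, z_2, \ldots\}$ is an orthonormal set, one has

$$y = \Big(\sum_{n=1}^{\infty} \alpha_n\cos\frac{1}{n}\Big)y + \sum_{n=1}^{\infty} \alpha_n\Big(\sin\frac{1}{n}\Big)z_n.$$

Since the two terms on the right-hand side are orthogonal to one another it follows
that

$$\sum_{n=1}^{\infty} \alpha_n\cos\frac{1}{n} = 1$$

and

$$\sum_{n=1}^{\infty} \alpha_n\Big(\sin\frac{1}{n}\Big)z_n = 0.$$

But then Lemma 5.17.10(b) implies that

$$\sum_{n=1}^{\infty} |\alpha_n|^2\sin^2\frac{1}{n} = \|0\|^2 = 0,$$

so $0 = \alpha_1 = \alpha_2 = \alpha_3 = \cdots$. This implies that $y = 0$ which is a contradiction, so y
cannot be expressed in the form (5.17.8). ∎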




This example illustrates a problem. In particular, we would like to know when
every point in $\overline{M}$ can be expressed in the form (5.17.8). Needless to say, Lemma
5.17.14 gives one case. The next theorem gives a more general case.


5.17.18 THEOREM. Let $\{x_n\}$ be a countable linearly independent set in a Hilbert
space H. Further assume that there are $D > 0$ and $\delta > 0$ such that

$$\delta^2(|\beta_1|^2 + |\beta_2|^2 + \cdots + |\beta_N|^2) \le \|\beta_1x_1 + \cdots + \beta_Nx_N\|^2 \qquad (5.17.9)$$
$$\|\beta_1x_1 + \cdots + \beta_Nx_N\|^2 \le D^2(|\beta_1|^2 + \cdots + |\beta_N|^2) \qquad (5.17.10)$$

for all scalars $\beta_1, \beta_2, \ldots, \beta_N$ and $N = 1, 2, 3, \ldots$. Then each $y \in \overline{M}$, where $\overline{M}$ is the
closed linear subspace of H defined above, can be expressed uniquely in the form

$$y = \alpha_1x_1 + \alpha_2x_2 + \cdots,$$

where the $\alpha_i$'s are scalars. Moreover, the coefficients $\alpha_n$ are continuous linear functions
of y. In fact, denoting these functions by $\alpha_n = l_n(y)$, there exists a constant $B > 0$ such
that $\|l_n\| \le B$ for all n.


Note that Example 3 contains a violation of (5.17.9) because, for example,
$\|x_n - x_{n+1}\|^2 \to 0$ as $n \to \infty$.
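
To see this violation concretely, one can evaluate $\|x_n - x_{n+1}\|^2$ for the set of Example 3. Because $y, z_n, z_{n+1}$ are orthonormal, the squared norm is the sum of the squared moduli of the three coefficients. The Python sketch below (the function name `dist_sq` is an assumption made for illustration) computes this quantity; under (5.17.9) with $\beta_n = 1$, $\beta_{n+1} = -1$ it would have to stay above $2\delta^2$ for some fixed $\delta > 0$.

```python
import math

def dist_sq(n):
    """||x_n - x_{n+1}||^2 for x_n = cos(1/n) y + sin(1/n) z_n (Example 3)."""
    dy = math.cos(1.0 / n) - math.cos(1.0 / (n + 1))   # coefficient of y
    zn = math.sin(1.0 / n)                             # coefficient of z_n
    zn1 = -math.sin(1.0 / (n + 1))                     # coefficient of z_{n+1}
    return dy * dy + zn * zn + zn1 * zn1

for n in (1, 10, 100, 1000):
    print(n, dist_sq(n))
```

The printed values decrease toward 0, so no positive $\delta$ can satisfy (5.17.9) for this set.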

Before proving this theorem it is interesting to determine the geometric
significance of conditions (5.17.9) and (5.17.10). To begin, we note that $D \ge \|x_N\| \ge \delta$ for
$N = 1, 2, \ldots$, so that the $x_n$'s neither get arbitrarily large nor arbitrarily small. It is
more interesting to note that (5.17.10) is a kind of orthogonality condition.
Let z be any unit vector in H. Then each $x_n$ can be uniquely expressed as
$x_n = \gamma_nz + w_n$, where $w_n$ is orthogonal to z and $\gamma_n$ is a scalar. Then

$$\|\beta_1x_1 + \cdots + \beta_Nx_N\|^2 = \|(\beta_1\gamma_1 + \cdots + \beta_N\gamma_N)z\|^2 + \|\beta_1w_1 + \cdots + \beta_Nw_N\|^2$$
$$\ge |\beta_1\gamma_1 + \cdots + \beta_N\gamma_N|^2\,\|z\|^2$$
$$= |\beta_1\gamma_1 + \cdots + \beta_N\gamma_N|^2,$$

and from (5.17.10) it follows that

$$D(|\beta_1|^2 + \cdots + |\beta_N|^2)^{1/2} \ge |\beta_1\gamma_1 + \cdots + \beta_N\gamma_N|$$

for all $\beta_i$'s and N. But this implies (Why?) that $|\gamma_1|^2 + |\gamma_2|^2 + \cdots < \infty$. Hence,
$\gamma_n \to 0$ as $n \to \infty$. Since $\gamma_n = (x_n,z)$, we see that the $x_n$'s get closer and closer to being
orthogonal to z. Moreover, z was any unit vector, so that the $x_n$'s "swing away"
from any given direction. Orthonormal sets, of course, do the same thing and more.

Among other things, Inequality (5.17.9) shows that $\|x_N - x_M\|^2 \ge 2\delta^2$ for all
$N \ne M$. That is, the $x_n$'s cannot get arbitrarily close together. Indeed let $A =
\{x_{n_1}, x_{n_2}, \ldots, x_{n_N}\}$ be any finite subset of $\{x_n\}$, and let $x_m$ be a point in $\{x_n\}$ that is not
in A. It follows from (5.17.9) that

$$\|x_m - c_1x_{n_1} - \cdots - c_Nx_{n_N}\|^2 \ge \delta^2(1 + |c_1|^2 + \cdots + |c_N|^2) \ge \delta^2$$

for all scalars $c_1, c_2, \ldots, c_N$. In other words, $x_m$ is a uniform positive distance away
from the finite-dimensional subspace spanned by A.




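Proof of Theorem 5.17.18: Let y be any point in $\overline{M}$, and let $y = \lim_{N\to\infty} S_N$,
where $S_N = \alpha_{N1}x_1 + \cdots + \alpha_{NN'}x_{N'}$ and $N'$ depends on N. Without any loss of
generality we can assume that $N \le N'$ and that $N' \le M'$ whenever $N \le M$. (One
need only add extra terms with the $\alpha$'s set equal to 0, if necessary.) Since $\{S_N\}$ is
a convergent sequence in H, we have

$$\|S_N - S_M\| = \|\alpha_{N1}x_1 + \cdots + \alpha_{NN'}x_{N'} - \alpha_{M1}x_1 - \cdots - \alpha_{MM'}x_{M'}\| \to 0$$

as $N, M \to \infty$. It follows from (5.17.9) that for $N \ge M$ one has

$$|\alpha_{N1} - \alpha_{M1}|^2 + |\alpha_{N2} - \alpha_{M2}|^2 + \cdots + |\alpha_{NM'} - \alpha_{MM'}|^2 + |\alpha_{N,M'+1}|^2 + \cdots + |\alpha_{NN'}|^2 \to 0$$

as $N, M \to \infty$. But then the sequence $\{a_N\}$, where

$$a_N = (\alpha_{N1}, \alpha_{N2}, \ldots, \alpha_{NN'}, 0, \ldots),$$

is a Cauchy sequence in $l_2$, so there is a point $a_0 = (\alpha_{01}, \alpha_{02}, \alpha_{03}, \ldots)$ in $l_2$ such that
$\lim_{N\to\infty} a_N = a_0$. We claim that

$$y = \alpha_{01}x_1 + \alpha_{02}x_2 + \alpha_{03}x_3 + \cdots.$$

Indeed, if

$$z_N = \alpha_{01}x_1 + \alpha_{02}x_2 + \cdots + \alpha_{0N'}x_{N'},$$

then

$$\|y - z_N\| = \|y - S_N + S_N - z_N\| \le \|y - S_N\| + \|S_N - z_N\|.$$

We know that $\|y - S_N\| \to 0$ as $N \to \infty$. Moreover, from (5.17.10) we have

$$\|S_N - z_N\|^2 = \|(\alpha_{N1} - \alpha_{01})x_1 + \cdots + (\alpha_{NN'} - \alpha_{0N'})x_{N'}\|^2$$
$$\le D^2\{|\alpha_{N1} - \alpha_{01}|^2 + \cdots + |\alpha_{NN'} - \alpha_{0N'}|^2\}.$$

Since $\lim_{N\to\infty} a_N = a_0$, we conclude that $\|S_N - z_N\| \to 0$ as $N \to \infty$. Hence,
$\|y - z_N\| \to 0$ as $N \to \infty$, and $y = \alpha_{01}x_1 + \alpha_{02}x_2 + \cdots$.

Next let us show that this series expression of y is unique. Suppose that

$$y = \alpha_{01}x_1 + \alpha_{02}x_2 + \cdots = \beta_1x_1 + \beta_2x_2 + \cdots.$$

Then

$$\|\alpha_{01}x_1 + \cdots + \alpha_{0N}x_N - \beta_1x_1 - \cdots - \beta_Nx_N\| \to 0 \quad\text{as } N \to \infty.$$

But from (5.17.9) we have

$$\|(\alpha_{01} - \beta_1)x_1 + \cdots + (\alpha_{0N} - \beta_N)x_N\|^2 \ge \delta^2(|\alpha_{01} - \beta_1|^2 + \cdots + |\alpha_{0N} - \beta_N|^2)$$

for all N. This implies that $\alpha_{0i} = \beta_i$, $i = 1, 2, \ldots$. Hence the series representation is
unique.

Finally, we show that the linear functionals $l_n$ are continuous. Let y and z be
any two points in $\overline{M}$, with $y = \beta_1x_1 + \beta_2x_2 + \cdots$ and $z = \gamma_1x_1 + \gamma_2x_2 + \cdots$. Then

$$\|y - z\|^2 = \lim_{N\to\infty} \|(\beta_1 - \gamma_1)x_1 + \cdots + (\beta_N - \gamma_N)x_N\|^2$$

because of the continuity of the norm. It follows from (5.17.9) that

$$\|y - z\|^2 \ge \delta^2\{|\beta_1 - \gamma_1|^2 + \cdots + |\beta_N - \gamma_N|^2\},$$

so

$$|\beta_n - \gamma_n| \le \frac{1}{\delta}\|y - z\|, \qquad n = 1, 2, \ldots.$$

Hence, $\|l_n\| \le 1/\delta = B$ for all n. ∎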


EXERCISES 


1. (a) Find an example of an infinite series in $l_2$ that is convergent but not
absolutely convergent.

(b) Show that the collection of absolutely convergent sequences in $l_2$ forms a
dense linear subspace of $l_2$.

2. How do Theorem 5.4.3 and Corollary 5.17.11 differ?

3. Show that one can drop the assumption of completeness in Corollary 5.17.11.

4. Prove Theorem 5.17.6. [Hint: Use Zorn's Lemma.]

5. Show that every orthonormal basis in a finite-dimensional inner product space
is also a Hamel basis.

6. Let $B_1$ and $B_2$ be two orthonormal bases of a given Hilbert space H. Show that
there is a one-to-one mapping $\psi$ of $B_1$ onto $B_2$. (This means that $B_1$ and $B_2$
have the same cardinal number. This cardinal number is called the Hilbert
dimension of H.)

(a) Show that for finite-dimensional spaces H, the Hilbert dimension of H
agrees with the (ordinary) dimension.

(b) What happens in infinite-dimensional spaces?


7. Let $\{e_{\alpha} : \alpha \in A\}$ be a maximal orthonormal set in a Hilbert space H. Let
$\{f_{\beta} : \beta \in B\}$ be another orthonormal set in H such that for each $\alpha$ in A there is a
$\beta$ in B such that

$$\|e_{\alpha} - f_{\beta}\|^2 < \tfrac{1}{2}.$$

(a) Show that $\beta$ is uniquely determined.
(b) Show that there is a one-to-one map of A into B, and that therefore
Cardinality (A) ≤ Cardinality (B). [Hint: Determine the mapping by (a). Let
$\beta = \alpha$ so that we have $\|e_{\alpha} - f_{\alpha}\|^2 < \tfrac{1}{2}$.]
(c) Show that the mapping in (b) is onto B, that is, A = B, as sets.
(d) Show that $\{f_{\beta} : \beta \in B\}$ is a maximal orthonormal set for H.

8. In Example 2, show that the angle between $y_n$ and $y_{n+1}$ tends to 0 as n tends to
$\infty$. Use

$$\cos\theta = \frac{(y_n, y_{n+1})}{\|y_n\| \cdot \|y_{n+1}\|}$$

as the definition of the angle.


9. Prove Theorem 5.17.16. [Hint: Show that if $\{x_n\}$ is a countable dense set in H,
then one can extract a maximal linearly independent subset. Use the Gram-Schmidt
process to construct a basis. Going the other way, let $\{y_n\}$ be a countable
basis. Show that the collection of all elements of the form

$$x_{nm} = \sum_{\text{finite}} (\alpha_{nm} + i\beta_{nm})y_n,$$

where $\alpha_{nm}$, $\beta_{nm}$ are rational, forms a countable dense set in H.]

10. The following is an example of a nonseparable Hilbert space. Let AP denote the
collection of all complex-valued functions x(t) defined for $-\infty < t < \infty$ and
with the property that

$$\lim_{T\to\infty} \frac{1}{2T}\int_{-T}^{T} |x(t)|^2\,dt < \infty.$$

(a) Let

$$(x,y) = \lim_{T\to\infty} \frac{1}{2T}\int_{-T}^{T} x(t)\overline{y(t)}\,dt.$$

Show that (x,y) is an inner product on AP.

(b) Let x be a continuous $\tau$-periodic function. Show that $x \in AP$ and that

$$\|x\|^2 = \frac{1}{\tau}\int_0^{\tau} |x(t)|^2\,dt.$$

(c) Show that (x,y) is also given by

$$(x,y) = \lim_{T\to\infty} \frac{1}{T}\int_0^{T} x(t)\overline{y(t)}\,dt.$$

(d) Let $\phi_{\alpha}(t) = e^{i\alpha t}$ where $\alpha$ is real. Show that $(\phi_{\alpha},\phi_{\beta}) = \delta_{\alpha\beta}$. (Hence AP has an
uncountable orthonormal set.)

(e) Let f(t) be Bohr almost periodic, that is, f is continuous and for every $\varepsilon > 0$
the set $E(\varepsilon,f) = \{\tau \in R : |f(t + \tau) - f(t)| < \varepsilon \text{ for all } t\}$ is relatively dense in
R, which means that there is an $l > 0$ such that every interval of R of length
$\ge l$ contains at least one element of $E(\varepsilon,f)$. Show that $f \in AP$. [Hint: First
note that f is bounded and uniformly continuous on R. Now show that

$$\lim_{T\to\infty} \frac{1}{2T}\int_{-T}^{T} |f(t)|^2\,dt$$

is finite, see Besicovitch [1].]

(f) Let $f \in AP$. Show that at most a countable number of the Fourier coefficients

$$C_{\alpha} = \lim_{T\to\infty} \frac{1}{2T}\int_{-T}^{T} f(t)e^{-i\alpha t}\,dt$$

are nonzero.

(g) Show that $\sum_{\alpha} |C_{\alpha}|^2 \le \|f\|^2$.

For a proof that $\{\phi_{\alpha}\}$ forms an orthonormal basis for AP see Riesz and Sz.-Nagy
[1, pp. 256-259].


11. In this exercise you are asked to show that M + N need not be closed in H when
M and N are disjoint and nonorthogonal closed linear subspaces of a Hilbert
space H. (See Theorem 5.15.5.) Let $H = l_2$ with the usual inner product and
let $e_n = (\delta_{1n}, \delta_{2n}, \ldots)$. Then $\{e_1, e_2, \ldots\}$ is an orthonormal basis for $l_2$. Let M
denote the closed linear subspace generated by $\{e_1, e_3, e_5, \ldots\}$, and let N
denote the closed linear subspace generated by $\{z_1, z_2, \ldots\}$, where

$$z_n = \alpha_ne_{2n-1} + \frac{1}{n}e_{2n},$$

where $\alpha_n > 0$ and $\alpha_n^2 = 1 - 1/n^2$.
(a) Show that $(z_n,z_m) = \delta_{nm}$ and that $M \cap N = \{0\}$.
(b) Let $y = \sum_n (1/n)e_{2n}$. Show that $y \notin M + N$ but that $y = \lim y_n$ where
$y_n \in M + N$. [Hint: Argue by contradiction and show that if $y = x + z$
where $x \in M$ and $z \in N$, then $(z,z_n) = 1$ for all n.]
12. Let $y_n(t) = t^n$ for $n = 0, 1, 2, \ldots$ and $-1 \le t \le 1$.
(a) Show that $\{y_0, y_1, \ldots\}$ is a linearly independent set in $L_2[-1,1]$.
(b) Use the Gram-Schmidt orthogonalization process on $\{y_0, y_1, \ldots\}$ to
determine the first four vectors $\{x_0, x_1, x_2, x_3\}$ in the orthonormal set
$\{x_0, x_1, \ldots\}$ generated by $\{y_0, y_1, \ldots\}$.
13, Using the notation of Exercise 12, the Legendre polynomials are defined by 


1/2 
t) = t), 3 a Roar 
p)=|~—| x0, 
Show that the following hold: 
d d 
—_ — = = 2 cree 
(a) eF Piss a Pon = (2 ep r= 2,3, 


d » a = = 
(b) pale —t ) + PAC) + n(n + 1)P,(t) = 0, R= ON seen! 3 


(2n + 1)tP,(t) — nP,,_ ,(t) 


(Cc) Pit) = ied ; NSN 2 or 
(d) P,i)=1 and P,(—1)=(-1)’, n=0,1,.... 
1 dd" 
=O P24ty co a ee 
14. Let $\{\phi_n\}$ be an orthonormal set in $L_2[a,b]$. Show that $\{\phi_n\}$ is a basis if and only if

$$\sum_{n=1}^{\infty} \Big|\int_a^x \phi_n(t)\,dt\Big|^2 = x - a$$

for all x in [a,b]. [Hint: Show that $\|f\|^2 = \sum_n |(f,\phi_n)|^2$ for all step functions f.]
15. Let {@,} be an orthonormal set inL,[a,b]. Show that {@¢,} is a basis if and only if 


If famaal O20 


n= | 




16. Show that Lemma 5.17.12 holds when M is a finite-dimensional linear 
subspace of an (incomplete) inner product space. Compare this result with the 
Projection Theorem (Theorem 5.16.4) and Example 1, Section 16. 


17. Let $\{x_n\}$ be an orthonormal set in an inner product space X, and also let
$x = \sum_n \alpha_nx_n$. Show that $\alpha_n = (x,x_n)$ and $\|x\|^2 = \sum_n |(x,x_n)|^2$.

18. Let $\{x_{\alpha}\}$ be an orthonormal set in an inner product space X, and let m be a
positive integer. Let x be fixed and show that the set

$$B_m = \{x_{\alpha} : \|x\|^2 < m|(x,x_{\alpha})|^2\}$$

contains at most m − 1 elements.


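19. Let $B = \{x_1, x_2, \ldots\}$ be an orthonormal set in a Hilbert space H. Then let
$\{z_n\}$ be a convergent sequence in H, where each $z_n$ is a finite linear combination
of points in B, that is, $z_n = \alpha_{n1}x_1 + \alpha_{n2}x_2 + \cdots + \alpha_{nn'}x_{n'}$. Show that for each
$j = 1, 2, \ldots$ the sequence $\{\alpha_{nj}\}$ is convergent, that is,

$$\lim_{n\to\infty} \alpha_{nj} = \alpha_j,$$

and that

$$\lim_{n\to\infty} z_n = \lim_{n\to\infty} \sum_{j=1}^{n'} \alpha_{nj}x_j = \sum_{j=1}^{\infty} \alpha_jx_j.$$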
20. Is the completeness of H crucial in Corollary 5.17.13? 


21. Explain why Corollary 5.17.15 requires completeness whereas Lemma 5.17.14 
does not. 


22. Use Lemma 5.17.14 and the orthogonal projection P to re-interpret the Bessel 
Inequality. (See Figure 5.17.3.) 


Figure 5.17.3.


18. EXAMPLES OF ORTHONORMAL BASES 


EXAMPLE 1. Let I be the interval [0,1] and H the complex space $L_2(I)$ with
the usual inner product. We claim that the set

$$\phi_n(t) = e^{2\pi int}, \qquad n = 0, \pm 1, \pm 2, \ldots$$

is an orthonormal set in H. Indeed, if $n \ne m$, then

$$(\phi_n,\phi_m) = \int_0^1 e^{2\pi int}\,\overline{e^{2\pi imt}}\,dt = \int_0^1 e^{2\pi int}e^{-2\pi imt}\,dt = \int_0^1 e^{2\pi i(n-m)t}\,dt = 0.$$

However,

$$\|\phi_n\|^2 = (\phi_n,\phi_n) = \int_0^1 e^{2\pi int}e^{-2\pi int}\,dt = 1.$$
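
The orthonormality relations just computed can be checked numerically. The following Python sketch approximates the integral $\int_0^1 \phi_n(t)\overline{\phi_m(t)}\,dt$ by a uniform Riemann sum; the function names `phi` and `inner_l2` and the number of sample points are assumptions made for illustration. For these particular integrands the Riemann sum reduces to a sum of roots of unity, so it is accurate up to rounding error whenever $0 < |n - m| <$ `samples`.

```python
import cmath

def phi(n, t):
    """phi_n(t) = exp(2*pi*i*n*t) on [0, 1]."""
    return cmath.exp(2j * cmath.pi * n * t)

def inner_l2(n, m, samples=64):
    """Riemann-sum approximation of (phi_n, phi_m) on [0, 1]."""
    h = 1.0 / samples
    total = sum(phi(n, k * h) * phi(m, k * h).conjugate() for k in range(samples))
    return total * h

print(abs(inner_l2(3, 3)))    # close to 1
print(abs(inner_l2(3, -2)))   # close to 0
```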


Next we want to show that the set $\{\phi_n : n = 0, \pm 1, \ldots\}$ is a maximal
orthonormal set. For this we need the following result.


5.18.1 LEMMA. Let $f \in H$ be a continuous real-valued function on [0,1] and
assume that

$$f \perp \phi_n, \qquad n = 0, \pm 1, \ldots.$$

Then f = 0.

Proof: Note that if $f \perp \phi_n$ for all n, then $f \perp P$ for any finite linear
combination P of the $\phi_n$. We will now proceed by contradiction, that is, assume that $f \ne 0$.
We shall now construct a finite linear combination P for which (f,P) > 0, which
gives us the contradiction.


Since $f \ne 0$ one has $f(t_0) \ne 0$ for some $t_0 \in [0,1]$. Say that $f(t_0) > 0$. Since f is
continuous, there exist constants $\delta > 0$ and $b > 0$ such that

$$f(t) \ge b, \quad\text{for } |t - t_0| \le \delta. \qquad (5.18.1)$$

(We assume that $\delta$ is chosen so that $0 < t_0 - \delta < t_0 + \delta < 1$.) Now let

$$\psi(t) = 1 + \cos 2\pi(t - t_0) - \cos 2\pi\delta$$

and

$$P(t) = [\psi(t)]^N,$$

where N is to be determined. Since

$$2\cos\theta = (e^{i\theta} + e^{-i\theta}),$$

$\psi$ is a finite linear combination of the $\phi_n$. Furthermore, for every N > 0, P is a
finite linear combination of the $\phi_n$. Also P is real-valued.
Define k by

$$k = \psi\Big(t_0 + \frac{\delta}{2}\Big) = 1 + \cos\pi\delta - \cos 2\pi\delta > 1.$$

Since

$$\psi(t) \ge k, \quad\text{for } |t - t_0| \le \frac{\delta}{2},$$

one has

$$P(t) \ge k^N, \quad\text{for } |t - t_0| \le \frac{\delta}{2}. \qquad (5.18.2)$$

Furthermore, since

$$\psi(t) \ge 1, \quad\text{for } |t - t_0| \le \delta,$$

one has

$$P(t) \ge 1, \quad\text{for } |t - t_0| \le \delta. \qquad (5.18.3)$$

Similarly, since

$$|\psi(t)| \le 1, \quad\text{for } |t - t_0| \ge \delta \text{ and } t \in I,$$

one has

$$|P(t)| \le 1, \quad\text{for } |t - t_0| \ge \delta \text{ and } t \in I. \qquad (5.18.4)$$

Since f is a continuous function, it is bounded on I, that is, there is a constant
M such that

$$-M \le f(t) \le M, \quad\text{for } t \in I. \qquad (5.18.5)$$

By using (5.18.4) and (5.18.5) we get

$$P(t)f(t) \ge -M, \quad\text{for } |t - t_0| \ge \delta \text{ and } t \in I.$$

And by using (5.18.1) and (5.18.3)

$$P(t)f(t) \ge b \ge -M, \quad\text{for } \frac{\delta}{2} \le |t - t_0| \le \delta.$$

Hence

$$P(t)f(t) \ge -M, \quad\text{for } |t - t_0| \ge \frac{\delta}{2} \text{ and } t \in I. \qquad (5.18.6)$$

Similarly by using (5.18.1) and (5.18.2) we get

$$P(t)f(t) \ge bk^N, \quad\text{for } |t - t_0| \le \frac{\delta}{2}. \qquad (5.18.7)$$

Since P is a finite linear combination of the $\phi_n$ one has $f \perp P$. But this gives us

$$0 = (f,P) = \int_0^1 P(t)f(t)\,dt = \Big(\int_0^{t_0-\delta/2} + \int_{t_0-\delta/2}^{t_0+\delta/2} + \int_{t_0+\delta/2}^{1}\Big)P(t)f(t)\,dt$$
$$\ge -M(1 - \delta) + \delta bk^N.$$

By choosing N sufficiently large, the right-hand side can be made positive and this
leads to a contradiction. ∎
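
The properties of the trigonometric polynomial $\psi(t) = 1 + \cos 2\pi(t - t_0) - \cos 2\pi\delta$ used in this proof can be spot-checked on a grid. In the Python sketch below, the values of `t0`, `delta`, `b`, and `M`, the grid size, and the helper names `psi` and `smallest_N` are all hypothetical choices made for illustration. The final bound is taken in the form $-M(1 - \delta) + \delta b k^N$, where the factor $\delta$ is the length of the middle interval of integration.

```python
import math

def psi(t, t0, delta):
    """psi(t) = 1 + cos(2*pi*(t - t0)) - cos(2*pi*delta)."""
    return 1 + math.cos(2 * math.pi * (t - t0)) - math.cos(2 * math.pi * delta)

def smallest_N(M, b, delta, k):
    """Smallest N >= 1 with -M*(1 - delta) + delta*b*k**N > 0 (requires k > 1)."""
    N = 1
    while -M * (1 - delta) + delta * b * k ** N <= 0:
        N += 1
    return N

t0, delta = 0.4, 0.1                       # hypothetical; 0 < t0 - delta < t0 + delta < 1
k = psi(t0 + delta / 2, t0, delta)         # the constant k of the proof
grid = [i / 1000 for i in range(1001)]
tol = 1e-9                                 # slack for rounding at interval endpoints

near = [t for t in grid if abs(t - t0) <= delta / 2]
far = [t for t in grid if abs(t - t0) >= delta]

ok_near = all(psi(t, t0, delta) >= k - tol for t in near)     # psi >= k near t0
ok_far = all(abs(psi(t, t0, delta)) <= 1 + tol for t in far)  # |psi| <= 1 away from t0

N = smallest_N(M=1.0, b=0.5, delta=delta, k=k)
print(k > 1, ok_near, ok_far, N)
```

Since $k > 1$, the loop in `smallest_N` always terminates, which is the numerical counterpart of "choosing N sufficiently large."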


5.18.2 LEMMA. Let $f \in H$ be a continuous complex-valued function on [0,1]
and assume that

$$f \perp \phi_n, \quad\text{for } n = 0, \pm 1, \ldots.$$

Then f = 0.

Proof: We apply Lemma 5.18.1 to the two real-valued functions Re(f) and
Im(f). It is for the reader to show that if $f \perp \phi_n$, then Re(f) $\perp \phi_n$ and Im(f) $\perp \phi_n$
for $n = 0, \pm 1, \ldots$. ∎


5.18.3 THEOREM. The set $\{\phi_n : n = 0, \pm 1, \pm 2, \ldots\}$ is a maximal orthonormal
set in $L_2[0,1]$.


Proof: We will show that if $f \in L_2[0,1]$ and $f \perp \phi_n$ for all $n = 0, \pm 1, \ldots$, then
f = 0, almost everywhere. Lemmas 5.18.1 and 5.18.2 establish this when f is continuous.
If f is not continuous, let

$$G(t) = -t\int_0^1 f(s)\,ds + \int_0^t f(s)\,ds \quad\text{and}\quad F(t) = G(t) - \int_0^1 G(s)\,ds$$

for $0 \le t \le 1$. It is known (see Appendix D) that F is continuous and differentiable
and that $F'(t) = f(t) - \int_0^1 f(s)\,ds$, almost everywhere. By integrating by parts one
easily gets

$$\int_0^1 F(t)e^{2\pi int}\,dt = 0, \quad\text{for } n = 0, \pm 1, \ldots,$$

that is, $F \perp \phi_n$. It follows from the above that F = 0, which in turn implies that
f = constant. Since $f \perp \phi_0$ one has f = 0. ∎


EXAMPLE 2. Let I = [a,b] be any other bounded interval with a < b, and let
H be the complex space $L_2(I)$ with the usual inner product. Then

$$\phi_n(t) = (b - a)^{-1/2}\exp\Big(2\pi in\,\frac{t - a}{b - a}\Big), \qquad n = 0, \pm 1, \ldots$$

forms a maximal orthonormal set in $L_2(I)$. ∎


EXAMPLE 3. Let Y denote the linear space made up of all complex-valued
functions f(z) defined on the unit circle $\Gamma$ of the complex plane such that

$$\frac{1}{2\pi i}\oint_{\Gamma} |f(z)|^2\,\frac{dz}{z} < \infty,$$

where the contour of integration is the unit circle, and

$$(f,g) = \frac{1}{2\pi i}\oint_{\Gamma} f(z)\overline{g(z)}\,\frac{dz}{z},$$

where $\overline{g}$ is the complex conjugate of g. [This is the space of all two-sided z-transforms
of sequences in $l_2(-\infty,\infty)$.] We claim that the set $\{\ldots, z^{-2}, z^{-1}, 1, z, z^2, \ldots\}$
is a maximal orthonormal set. In fact, this is merely Example 1 in disguise. ∎


EXAMPLE 4. (MULTIPLE FOURIER SERIES.) Let I be the rectangle in $R^m$ consisting
of all $t = (t_1, \ldots, t_m)$ such that $0 \le t_i \le 1$, $i = 1, \ldots, m$. Let $\mu = (\mu_1, \ldots, \mu_m)$ be
an m-vector with integer components and define

$$\mu \cdot t = (\mu,t) = \mu_1t_1 + \cdots + \mu_mt_m.$$

Let $\|\mu\| = (\mu \cdot \mu)^{1/2}$. Let A denote the class of all such $\mu$. If f is a complex-valued
function defined on I we let $\int_I f\,dt$ denote

$$\int_I f\,dt = \int_0^1 \cdots \int_0^1 f(t_1, \ldots, t_m)\,dt_1 \cdots dt_m.$$

The space $L_2(I)$ is the space of all measurable complex-valued functions f for which
$\int_I |f|^2\,dt < \infty$. The inner product on $L_2(I)$ is given by

$$(f,g) = \int_I f\overline{g}\,dt.$$

For each $\mu$ in A we define

$$\phi_{\mu}(t) = e^{2\pi i\mu \cdot t}. \qquad (5.18.8)$$

Then $\overline{\phi_{\mu}(t)} = \phi_{-\mu}(t)$. Let $p = (p_1, \ldots, p_m) = \mu - \nu$, then

$$(\phi_{\mu},\phi_{\nu}) = \int_I e^{2\pi i\mu \cdot t}e^{-2\pi i\nu \cdot t}\,dt = \int_I e^{2\pi ip \cdot t}\,dt = \int_I e^{2\pi i(p_1t_1 + \cdots + p_mt_m)}\,dt$$
$$= \int_0^1 e^{2\pi ip_1t_1}\,dt_1 \cdots \int_0^1 e^{2\pi ip_mt_m}\,dt_m.$$

If $p \ne 0$, then for some coordinate one has $p_i \ne 0$, and

$$\int_0^1 e^{2\pi ip_it_i}\,dt_i = 0.$$

This implies that $(\phi_{\mu},\phi_{\nu}) = 0$ if $\mu \ne \nu$. Similarly one gets

$$(\phi_{\mu},\phi_{\mu}) = \int_I dt = 1.$$

Hence the family

$$\{\phi_{\mu} : \mu \in A\}$$

is an orthonormal family. By reasoning similar to that used in Example 1 we see that
this family is maximal.

The Fourier series expansion in this case is

$$f = \sum_{\mu} c_{\mu}\phi_{\mu}, \qquad (5.18.9)$$

where $c_{\mu} = (f,\phi_{\mu})$. This is sometimes written as

$$f(t_1, \ldots, t_m) = \sum_{\mu_1, \ldots, \mu_m} c_{\mu_1 \cdots \mu_m}e^{2\pi i(\mu_1t_1 + \cdots + \mu_mt_m)}. \qquad (5.18.10)$$

For obvious reasons, the notation in (5.18.9) is preferable to that of (5.18.10). ∎
EXAMPLE 5. It is easy to see that

$$e_n = (\delta_{n1}, \delta_{n2}, \ldots)$$

defines an orthonormal basis for $l_2[0,\infty)$. Similarly

$$e_n = (\ldots, \delta_{n,-1}, \delta_{n0}, \delta_{n1}, \delta_{n2}, \ldots)$$

defines an orthonormal basis for $l_2(-\infty,\infty)$. (This is probably a good time for the
reader to review Example 2 of Section 4.7.) ∎


EXAMPLE 6. The Laguerre functions form an orthonormal set for $L_2[0,\infty)$
(compare with Exercise 8). These are defined by

$$\phi_n(t) = \frac{1}{n!}e^{-t/2}L_n(t), \qquad n = 0, 1, \ldots,$$

where $L_n(t)$ is the Laguerre polynomial

$$L_n(t) = e^tD^n(t^ne^{-t}) = \sum_{k=0}^{n} (-1)^k\binom{n}{k}n(n-1)\cdots(k+1)\,t^k$$

and $D^n = d^n/dt^n$. ∎
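
Because $\phi_n\phi_m = (n!\,m!)^{-1}e^{-t}L_n(t)L_m(t)$ and $\int_0^{\infty} t^je^{-t}\,dt = j!$, the inner products of the Laguerre functions can be evaluated exactly from the polynomial coefficients. The following Python sketch does this with rational arithmetic; the function names `laguerre_coeffs` and `laguerre_inner` are assumptions made for illustration, and the coefficient formula is the sum given above with $n(n-1)\cdots(k+1) = n!/k!$.

```python
from fractions import Fraction
from math import comb, factorial

def laguerre_coeffs(n):
    """Coefficients a_0, ..., a_n of L_n(t) = sum_k (-1)^k C(n,k) (n!/k!) t^k."""
    return [(-1) ** k * comb(n, k) * (factorial(n) // factorial(k))
            for k in range(n + 1)]

def laguerre_inner(n, m):
    """Exact value of (phi_n, phi_m) on [0, infinity).

    Uses int_0^inf t^j e^{-t} dt = j! term by term on L_n(t) * L_m(t).
    """
    a, b = laguerre_coeffs(n), laguerre_coeffs(m)
    total = sum(ak * bj * factorial(k + j)
                for k, ak in enumerate(a)
                for j, bj in enumerate(b))
    return Fraction(total, factorial(n) * factorial(m))

gram = [[laguerre_inner(n, m) for m in range(5)] for n in range(5)]
```

The matrix `gram` should be the 5 × 5 identity, in agreement with the orthonormality claimed in Example 6 for the first few indices. The check says nothing about completeness.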
EXAMPLE 7. The Hermite functions

$$\phi_n(x) = \frac{e^{-x^2/2}}{[2^nn!\sqrt{\pi}]^{1/2}}H_n(x), \qquad n = 0, 1, \ldots$$

form an orthonormal basis for $L_2(-\infty,\infty)$, where $H_n(x)$ is the Hermite polynomial,

$$H_n(x) = (-1)^ne^{x^2}D^n(e^{-x^2}).$$

(See Section 7.14.) ∎




EXAMPLE 8. One can construct an orthonormal basis for $L_2(R^m)$ as follows:
Let $n = (n_1, \ldots, n_m)$ be a vector with nonnegative integer entries and also let
$x = (x_1, \ldots, x_m)$ be a point in $R^m$. Define

$$H_n(x) = H_{n_1}(x_1)H_{n_2}(x_2) \cdots H_{n_m}(x_m),$$

where $H_{n_i}$ is the Hermite polynomial of order $n_i$. Let

$$n! = (n_1!)(n_2!) \cdots (n_m!), \qquad |n| = n_1 + n_2 + \cdots + n_m,$$

and

$$e^{-|x|^2/2} = e^{-x_1^2/2} \cdot e^{-x_2^2/2} \cdots e^{-x_m^2/2}.$$

Then

$$\phi_n(x) = \frac{e^{-|x|^2/2}}{[2^{|n|}n!\,\pi^{m/2}]^{1/2}}H_n(x)$$

is an orthonormal basis for $L_2(R^m)$. ∎


EXERCISES 


1. Prove that the set $\{\phi_n : n = 0, \pm 1, \ldots\}$ defined in Example 2 forms a maximal
orthonormal set in $L_2[a,b]$. [Hint: Define the real-valued function $s = \alpha(t)$ by

$$s = \alpha(t) = \frac{t - a}{b - a}.$$

Then $\alpha : [a,b] \to [0,1]$, and $\alpha^{-1} : [0,1] \to [a,b]$. Now define an operator K by
y = Kx, where

$$y(t) = (b - a)^{-1/2}x(\alpha(t)).$$

Show that K is a linear mapping of $L_2[0,1]$ onto $L_2[a,b]$ and $\|Kx\| = \|x\|$.
Now compare the basis in Example 1 to that of Example 2. What happens if
a = b?]


2. Show that the family

$$\{1\} \cup \{\sqrt{2}\cos 2\pi nt : n = 1, 2, \ldots\} \cup \{\sqrt{2}\sin 2\pi nt : n = 1, 2, \ldots\}$$

forms an orthonormal basis for $L_2[0,1]$. How is this related to the basis in
Example 1?


3. Use the methods of Examples 2 and 3 to find a maximal orthonormal set for
$L_2(I)$ where I is a rectangle in $R^m$ given by

$$I = \{t = (t_1, \ldots, t_m) : a_i \le t_i \le b_i,\ i = 1, \ldots, m\},$$

where $a_i < b_i$.


4. In Exercises 12 and 13 of Section 17 it was shown that the Legendre polynomials

$$P_n(t) = \frac{1}{2^nn!}\frac{d^n}{dt^n}(t^2 - 1)^n, \qquad n = 0, 1, 2, \ldots$$

form an orthogonal set and that $x_n(t) = [(2n + 1)/2]^{1/2}P_n(t)$ forms an orthonormal
set for $L_2[-1,1]$. In this exercise we shall prove that the orthonormal
set $\{x_n : n = 0, 1, \ldots\}$ is maximal. The proof is similar to the argument of
Example 1.

(a) Show that f = 0 is the only continuous, real-valued function f on [−1,1]
for which $\int_{-1}^{1} f(t)t^n\,dt = 0$, $n = 0, 1, \ldots$. In other words, if $y_n(t) = t^n$ we claim that if
$f \perp y_n$ for all n, then f = 0.

(b) Next show that the only continuous, complex-valued function f on [−1,1]
for which $\int_{-1}^{1} f(t)t^n\,dt = 0$ is f = 0, that is, the only continuous, complex-valued
function for which $f \perp y_n$, $n = 0, 1, \ldots$, is f = 0.

(c) Now show that the orthonormal set $\{x_n : n = 0, 1, \ldots\}$ is maximal.


5. Let $X$ denote the collection of all functions $f(z)$ that are analytic for $|z| < 1$ and such that
\[ \iint_{|z|<1} |f(z)|^2\,dx\,dy < \infty. \]

(a) Show that
\[ (f,g) = \iint_{|z|<1} f(z)\overline{g(z)}\,dx\,dy \]
defines an inner product on $X$.

(b) Let $\phi_n(z) = (n/\pi)^{1/2} z^{n-1}$ for $n = 1, 2, \ldots$. Show that $\{\phi_n\}$ forms an orthonormal basis for $X$.

(c) Compare the Fourier coefficients of $f$ with the coefficients in the power series expansion for $f$.


6. (Isoperimetric Theorem.) Show that among all simple, closed piecewise smooth curves of length $L$ in the plane, the circle encloses the maximum area. [Hint: Proceed as follows.

(a) Let $x = x(s)$, $y = y(s)$, $0 \le s \le L$, be a parametric representation of the curve using the arc length $s$ as a parameter. Let $tL = s$ and let
\[ x(t) = a_0 + \sqrt{2} \sum_{n=1}^{\infty} (a_n \cos 2\pi n t + b_n \sin 2\pi n t) \]
\[ y(t) = c_0 + \sqrt{2} \sum_{n=1}^{\infty} (c_n \cos 2\pi n t + d_n \sin 2\pi n t) \]
be the Fourier series expansions for $x$ and $y$ on the interval $0 \le t \le 1$. (See Exercise 2.) Show that
\[ \left(\frac{dx}{dt}\right)^2 + \left(\frac{dy}{dt}\right)^2 = \left(\frac{ds}{dt}\right)^2 = L^2 \]
and
\[ L^2 = \int_0^1 \left[\left(\frac{dx}{dt}\right)^2 + \left(\frac{dy}{dt}\right)^2\right] dt = \sum_{n=1}^{\infty} 4\pi^2 n^2 (a_n^2 + b_n^2 + c_n^2 + d_n^2). \]

(b) Show that the area $A$ satisfies
\[ A = \int_0^1 x \frac{dy}{dt}\,dt = \sum_{n=1}^{\infty} 2\pi n (a_n d_n - b_n c_n). \]

(c) Show that $L^2 - 4\pi A \ge 0$ and that equality holds if and only if $a_1 = d_1$, $b_1 = -c_1$, and $a_n = b_n = c_n = d_n = 0$ for $n = 2, 3, \ldots$, which describes the equation of a circle.]


7. Use Example 8 to show that if $A$ is any measurable set in $R^m$, then $L_2(A)$ has a basis that is at most countable.


8. The object of this exercise is to show that the Laguerre functions form an orthonormal basis for $L_2[0,\infty)$, see Example 6.

(a) Show that $\{\phi_n\}$ is an orthonormal set. [Hint: Compute
\[ \int_0^{\infty} e^{-t} t^k L_n(t)\,dt = \int_0^{\infty} t^k \frac{d^n}{dt^n}(t^n e^{-t})\,dt \]
by integrating by parts.]

(b) Show that $\{\phi_n\}$ is maximal. [Hint: Modify the argument of Example 1, Section 18 by replacing $P(t)$, in Lemma 5.18.1, by a function of the form $e^{-t/2}Q(t)$ where $Q(t)$ is an appropriate polynomial in $t$.]


9. Show that
\[ \sum_{n=0}^{\infty} \frac{t^n}{n!} L_n(x) = (1 - t)^{-1} \exp[-xt(1 - t)^{-1}], \]
where $L_n(x)$ is the Laguerre polynomial.


10. In Example 3, Section 14, we used the functions $\{e^{-t}, te^{-t}, t^2 e^{-t}, \ldots\}$.

(a) How are these related to the Laguerre functions? [Hint: Look at the Gram-Schmidt orthogonalization process.]

(b) Is it true, in this example, that
\[ \lim_{n \to \infty} \int_0^{\infty} |h(t) - h_n(t)|^2\,dt = 0, \]
where $h$ and $h_n$ are defined in Example 3, Section 14?


11. Use the results of this section to re-examine Example 2 of Section 4.7.




12. This is the outline of another proof that the Laguerre functions form a maximal orthonormal set in $L_2[0,\infty)$.

(a) Define $h(x,t)$ and $g(x,t)$ by
\[ h(x,t) = (1 - t)^{-1} \exp\left(\frac{-xt}{1 - t}\right) = \sum_{n=0}^{\infty} \frac{t^n}{n!} L_n(x) \]
\[ g(x,t) = \exp\left(-\frac{x}{2}\right) h(x,t) = (1 - t)^{-1} \exp\left[-\frac{x}{2}\,\frac{1 + t}{1 - t}\right] = \sum_{n=0}^{\infty} t^n \phi_n(x). \]
Show that
\[ \int_0^{\infty} |g(x,t)|^2\,dx = (1 - t^2)^{-1} \]
and
\[ \int_0^{\infty} g(x,t)\phi_n(x)\,dx = t^n. \]

(b) Show that
\[ \int_0^{\infty} \left| g(x,t) - \sum_{n=0}^{N} t^n \phi_n(x) \right|^2 dx = \frac{1}{1 - t^2} - \sum_{n=0}^{N} t^{2n}. \]

(c) Show that every function of the form $e^{-ax}$, $(0 < a < \infty)$ can be approximated arbitrarily closely in $L_2[0,\infty)$ by a finite linear combination of Laguerre functions $\phi_n$. [Hint: Use (b) with $a = \frac{1}{2}(\{1 + t\}/\{1 - t\})$.]

(d) Show that every function in $L_2[0,\infty)$ can be approximated arbitrarily closely in $L_2[0,\infty)$ by a finite linear combination of functions of the form $e^{-ax}$, $0 < a < \infty$. [Hint: Perform a change of variables $y = e^{-x}$ mapping $0 \le x < \infty$ into $[0,1]$ and then use the fact that the polynomials in $y$ are dense in $L_2[0,1]$.]


19. UNITARY OPERATORS AND EQUIVALENT INNER PRODUCT SPACES 


We have discussed the question of equivalence at various levels: equivalences between metric spaces (homeomorphisms and isometries), equivalences between linear spaces (isomorphisms), and equivalences between normed linear spaces (topological and isometric isomorphisms). The basic idea of equivalence is the same in each case, that is, there exists a one-to-one mapping of one space onto the other and this mapping preserves the given structure. We now introduce the analog for inner product spaces.


5.19.1 DEFINITION. Let $X$ and $Y$ be two inner product spaces. We say that $X$ and $Y$ are unitarily equivalent if there is an isomorphism $\phi: X \to Y$ of $X$ onto $Y$ that preserves inner products, that is
\[ (\phi(x_1), \phi(x_2)) = (x_1, x_2) \]
for all $x_1, x_2 \in X$. The mapping $\phi$ is referred to as a unitary operator.




Since unitary operators preserve inner products, one has $\|\phi(x)\| = \|x\|$ for all $x \in X$, so $\phi$ is also an isometric isomorphism. In fact, the converse is also true.


5.19.2 THEOREM. A mapping $\phi$ is an isometric isomorphism of $X$ onto $Y$, where $X$ and $Y$ are inner product spaces, if and only if $\phi$ is a unitary operator.


Proof: The only issue here is to show that if $\|\phi(x)\| = \|x\|$ for all $x \in X$, then $(\phi(x), \phi(y)) = (x,y)$ for all $x, y \in X$. Since $\phi$ is linear we have
\[
\begin{aligned}
4(\phi(x), \phi(y)) &= (\phi(x + y), \phi(x + y)) - (\phi(x - y), \phi(x - y)) \\
&\quad + i(\phi(x + iy), \phi(x + iy)) - i(\phi(x - iy), \phi(x - iy)) \\
&= (x + y, x + y) - (x - y, x - y) \\
&\quad + i(x + iy, x + iy) - i(x - iy, x - iy) \\
&= 4(x,y).
\end{aligned}
\]
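The proof rests on the polarization identity, which recovers the inner product from norms alone. The following sketch checks that identity numerically in $C^n$ with the usual inner product (linear in the first argument, conjugate-linear in the second); the function names are illustrative and not from the text.

```python
def inner(x, y):
    """Usual inner product on C^n: linear in x, conjugate-linear in y."""
    return sum(a * complex(b).conjugate() for a, b in zip(x, y))

def norm_sq(x):
    """Squared norm (x, x), returned as a real number."""
    return inner(x, x).real

def combine(x, y, c):
    """Return the vector x + c*y."""
    return [a + c * b for a, b in zip(x, y)]

def polarization(x, y):
    """Recover (x, y) from norms only:
    4(x,y) = |x+y|^2 - |x-y|^2 + i|x+iy|^2 - i|x-iy|^2."""
    return (norm_sq(combine(x, y, 1)) - norm_sq(combine(x, y, -1))
            + 1j * norm_sq(combine(x, y, 1j))
            - 1j * norm_sq(combine(x, y, -1j))) / 4
```

Since the right-hand side involves only norms, any linear map that preserves norms must preserve `polarization(x, y)`, and hence the inner product.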


Needless to say, if two Hilbert spaces are unitarily equivalent, they are essentially the same Hilbert space and differ only in the nature of the points in their underlying sets.

We shall show later (Theorem 5.22.7), after we introduce the concept of the adjoint, that an operator $\phi$ is unitary if and only if $\phi^{-1} = \phi^*$, where $\phi^*$ is the adjoint of $\phi$.


EXAMPLE 1. Let $X$ be a finite-dimensional complex inner product space with $\dim X = n$. We shall now show that $X$ is unitarily equivalent to $C^n$ with the usual inner product.

Let $\{e_1, e_2, \ldots, e_n\}$ be a Hamel basis for $X$. By using the Gram-Schmidt orthogonalization process, if necessary, we can assume that $\{e_1, e_2, \ldots, e_n\}$ is an orthonormal basis. It follows from the Fourier Series Theorem that for every $x \in X$ there are complex numbers $x_i = (x, e_i)$, $1 \le i \le n$, such that $x = \sum_{i=1}^{n} x_i e_i$. Now define a mapping $\Phi: X \to C^n$ by
\[ \Phi x = (x_1, \ldots, x_n). \]
It is clear that $\Phi$ is an isomorphism of $X$ onto $C^n$ since $\{e_1, e_2, \ldots, e_n\}$ is a basis. Furthermore, Parseval's Equality (Theorem 5.17.8) assures us that
\[ (x,y) = \sum_i (x, e_i)\overline{(y, e_i)} = \sum_i x_i \overline{y_i} = [\Phi x, \Phi y], \]
where $[\cdot\,, \cdot]$ denotes the usual inner product on $C^n$. Hence $\Phi$ is a unitary mapping.


EXAMPLE 2. Consider the space $L_2[0,1]$. Let us show that this is unitarily equivalent to $l_2(-\infty,\infty)$. It is shown in Example 1, Section 18, that $\phi_n(t) = e^{2\pi i n t}$ for $n = 0, \pm 1, \ldots$ forms an orthonormal basis for $L_2[0,1]$. By the Fourier Series Theorem 5.17.8 one can find for every $x \in L_2[0,1]$ a bi-sequence of complex numbers $(\ldots, x_{-1}, x_0, x_1, x_2, \ldots)$ where $x_n = (x, \phi_n)$ such that $x = \sum_n x_n \phi_n$. Now define a mapping $F: L_2[0,1] \to l_2(-\infty,\infty)$ by
\[ Fx = (\ldots, x_{-1}, x_0, x_1, x_2, \ldots). \]
It is clear that $F$ is linear. It is also one-to-one since $x_n = (x, \phi_n) = 0$ for all $n$ implies that $x = 0$ [Theorem 5.17.8(e)]. Also the range is all of $l_2(-\infty,\infty)$ by Lemma 5.17.10(a). So we see that $F$ is an isomorphism of $L_2[0,1]$ onto $l_2(-\infty,\infty)$. Furthermore, Parseval's Equality [Theorem 5.17.8(c)] assures us that
\[ (x,y) = \sum_n (x, \phi_n)\overline{(y, \phi_n)} = \sum_n x_n \overline{y_n} = [Fx, Fy], \]
where $[\cdot\,, \cdot]$ denotes the usual inner product on $l_2(-\infty,\infty)$. Hence $F$ is a unitary mapping.
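As a small numerical illustration of Parseval's Equality for the basis $\phi_n(t) = e^{2\pi i n t}$, the sketch below computes Fourier coefficients of a trigonometric polynomial by a Riemann sum. This is an assumption-laden toy: the test function `example_x` is hypothetical, and the Riemann sum is exact (up to rounding) only because the integrands are trigonometric polynomials of degree well below the number of sample points.

```python
import cmath

def fourier_coefficient(x, n, samples=64):
    """Approximate x_n = (x, phi_n) = int_0^1 x(t) exp(-2 pi i n t) dt
    by an equispaced Riemann sum with `samples` points."""
    return sum(x(k / samples) * cmath.exp(-2j * cmath.pi * n * k / samples)
               for k in range(samples)) / samples

def norm_sq_L2(x, samples=64):
    """Approximate int_0^1 |x(t)|^2 dt by the same Riemann sum."""
    return sum(abs(x(k / samples)) ** 2 for k in range(samples)) / samples

def example_x(t):
    """Hypothetical test function with coefficients x_0 = 2, x_1 = 1 - i, x_{-3} = 1/2."""
    return (2 + (1 - 1j) * cmath.exp(2j * cmath.pi * t)
            + 0.5 * cmath.exp(-6j * cmath.pi * t))
```

Summing `abs(fourier_coefficient(example_x, n)) ** 2` over a range of `n` that covers the three nonzero coefficients should agree with `norm_sq_L2(example_x)`, which is the statement $\|x\|^2 = \|Fx\|^2$ for this particular $x$.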


It should be clear that we could replace $L_2[0,1]$ with $L_2(I)$ where $I$ is any interval in $R$, or more generally, where $I$ is any rectangle or set in $R^m$. All that is needed in order to get the unitary equivalence is that $L_2(I)$ have a countable orthonormal basis.


EXAMPLE 3. Let $Z$ be the Hilbert space of z-transforms defined on the unit circle that was introduced in Example 4, Section 4.5. Let $U$ be the mapping of $Z$ into $L_2[0,1]$ defined by
\[ (Uf)(t) = f(e^{-2\pi i t}), \qquad t \in [0,1], \]
where $f$ is a point in $Z$. The basic idea here is to "stretch and wrap" the interval $[0,1]$ around the unit circle. If $u \in L_2[0,1]$, it is easily shown that
\[ (U^{-1}u)(z) = u[t(z)], \]
where $t = t(z) = -\arg(z)/2\pi$. The inner product in $Z$ is defined by
\[ (f,g)_Z = \frac{1}{2\pi i} \oint f(z)\overline{g(z)}\,\frac{dz}{z}. \]
If $u$ and $v$ are two points in $L_2[0,1]$, then
\[ (U^{-1}u, U^{-1}v)_Z = \frac{1}{2\pi i} \oint u[t(z)]\,\overline{v[t(z)]}\,\frac{dz}{z} \]
and a simple change of variable of integration shows that
\[ (U^{-1}u, U^{-1}v)_Z = \int_0^1 u(t)\overline{v(t)}\,dt = (u,v), \]
for all $u$, $v$ in $L_2[0,1]$. So $U$ is a unitary operator mapping $Z$ into $L_2[0,1]$.
EXAMPLE 4. We can combine the operators $F$ and $U$ from the preceding two examples to show that the two-sided z-transform $\mathcal{Z}$ is a unitary operator. We again refer the reader to what has already been said about the two-sided z-transform in Example 4, Section 4.5. As cited in that example, $\mathcal{Z}$ is a mapping of $l_2(-\infty,\infty)$ into the Hilbert space $Z$ of Example 3 above. So we have the situation illustrated in Figure 5.19.1.

Figure 5.19.1

Let us show that $\mathcal{Z} = U^{-1}F^{-1}$. It follows from the preceding examples that
\[ (F^{-1}x)(t) = \sum_{n=-\infty}^{\infty} x_n e^{2\pi i n t}, \]
where $x = \{\ldots, x_{-1}, x_0, x_1, x_2, \ldots\}$, and
\[ U^{-1}F^{-1}x = \sum_{n=-\infty}^{\infty} x_n z^{-n}. \]
So $\mathcal{Z} = U^{-1}F^{-1}$. Hence, $\mathcal{Z}$ is unitary. In particular, we have
\[ \frac{1}{2\pi i} \oint |f(z)|^2\,\frac{dz}{z} = \sum_{n=-\infty}^{\infty} |x_n|^2, \]
where $f = \mathcal{Z}x$.


Some care should be taken in reading the definition of a unitary mapping. A unitary mapping must satisfy four conditions: It must (1) be linear, (2) be one-to-one, (3) be "onto," and (4) preserve inner products.


EXAMPLE 5. Consider the (left) shift operator $S_l$ on sequences. That is, $y = S_l x$ means $y_n = x_{n+1}$. This is a unitary mapping on $l_2(-\infty,\infty)$. However, $S_l$ is not one-to-one on $l_2[0,\infty)$ since $S_l(1,0,0,\ldots) = 0$.

The (right) shift operator $S_r$ is merely the inverse of $S_l$ on $l_2(-\infty,\infty)$ and it is, of course, a unitary mapping there. On $l_2[0,\infty)$ $S_r$ is defined by $y_n = x_{n-1}$ for $n = 1, 2, \ldots$ and $y_0 = 0$. In this case the range of $S_r$ is not all of $l_2[0,\infty)$ even though this mapping is linear and one-to-one and preserves inner products.
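The two failures on $l_2[0,\infty)$ can be seen concretely on finitely supported sequences. In the sketch below a sequence $(x_0, x_1, \ldots)$ is represented by a Python list, with all entries past the end of the list understood to be zero; this representation and the function names are assumptions made for illustration.

```python
def left_shift(x):
    """S_l on l_2[0, inf): (S_l x)_n = x_{n+1}. The entry x_0 is discarded."""
    return x[1:]

def right_shift(x):
    """S_r on l_2[0, inf): (S_r x)_0 = 0 and (S_r x)_n = x_{n-1} for n >= 1."""
    return [0] + x

def inner(x, y):
    """Inner product of two finitely supported sequences.
    zip() stops at the shorter list, which matches treating missing entries as zero."""
    return sum(a * complex(b).conjugate() for a, b in zip(x, y))

def is_zero(x):
    """True if every stored entry of the sequence is zero."""
    return all(v == 0 for v in x)
```

Here `left_shift([1, 0, 0])` is the zero sequence, so $S_l$ is not one-to-one, while every output of `right_shift` has a zero in position 0, so a sequence such as $(1,0,0,\ldots)$ is not in the range of $S_r$.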


The Fourier transform is, of course, a famous example of a unitary operator.


EXAMPLE 6. (FOURIER TRANSFORM ON $R^1$.) The Fourier transform $\mathcal{F}$ and its inverse $\mathcal{F}^{-1}$ are given by the following equations:
\[ \hat f(i\omega) = (\mathcal{F}f)(i\omega) = \int_{-\infty}^{\infty} e^{-i\omega x} f(x)\,dx, \tag{5.19.1} \]
\[ f(x) = (\mathcal{F}^{-1}\hat f)(x) = \frac{1}{2\pi} \int_{-\infty}^{\infty} e^{i\omega x} \hat f(i\omega)\,d\omega. \tag{5.19.2} \]
We will show in Example 12, Section 22 that $\mathcal{F}$ is a unitary mapping of $L_2(-\infty,\infty)$ onto $L_2(-i\infty,i\infty)$ and that $\mathcal{F}^{-1}$ is given by (5.19.2).

In the literature the transformation
\[ g(y) = (Ff)(y) = \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{\infty} e^{-ixy} f(x)\,dx \tag{5.19.3} \]
is sometimes referred to as the "Fourier transform." It is clear that both $F$ and $\mathcal{F}$ are related and that $F^{-1}$ is given by
\[ f(x) = (F^{-1}g)(x) = \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{\infty} e^{ixy} g(y)\,dy. \tag{5.19.4} \]
Furthermore, since $\mathcal{F}$ is a unitary mapping we see that $F$ is a unitary mapping of $L_2(-\infty,\infty)$ onto itself.


EXAMPLE 7. In this example we show the use of orthonormal sets and unitary 
operators in a classic sampling theorem. 

This will be one of the few cases where we consider the real instead of the 
complex Hilbert space H, = L,(— 00,00). In particular, the points in H, are real- 
valued functions and, of course, the scalars are real too. This restriction to a real 
space is not necessary, but simplifies things and conforms to the usual applications. 

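If we take the Fourier transform of points in $H_t$, we obtain all the complex-valued functions $X(i\omega)$ such that

(a) $\displaystyle \frac{1}{2\pi} \int_{-\infty}^{\infty} |X(i\omega)|^2\,d\omega < \infty$,

(b) $\mathrm{Re}[X(i\omega)]$ is an even function about $\omega = 0$, that is,
\[ \mathrm{Re}[X(i\omega)] = \mathrm{Re}[X(-i\omega)] \]
for almost all $\omega$, and $\mathrm{Im}[X(i\omega)]$ is an odd function, that is
\[ \mathrm{Im}[X(i\omega)] = -\mathrm{Im}[X(-i\omega)] \]
for almost all $\omega$.

Using the natural structure on this set of functions and the real numbers as scalars, we have a real Hilbert space $H_\omega$ with the inner product defined by
\[ (X,Y) = \frac{1}{2\pi} \int_{-\infty}^{\infty} X(i\omega)\overline{Y(i\omega)}\,d\omega. \]
Note that $(X,Y)$ is real for all $X, Y \in H_\omega$.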

It can be shown, using the preceding example, that the Fourier transform $\mathcal{F}$ (in the $L_2$ sense) is a unitary mapping of $H_t$ onto $H_\omega$; therefore, $H_t$ and $H_\omega$ are unitarily equivalent. The usual interpretation is $H_t$ is the time domain and $H_\omega$ is the frequency domain associated with real systems.$^{16}$


$^{16}$ That is, systems that map real-valued input functions into real-valued output functions.




Now let us consider the closed linear subspace $M_t$ of $H_t$ defined by
\[ M_t = \{x \in H_t : (\mathcal{F}x)(i\omega) = 0 \text{ for } |\omega| \ge \omega_c\}, \]
where $\omega_c > 0$ is a real number. One often says that $x \in M_t$ is "bandlimited to frequencies below $f_c = \omega_c/2\pi$" or "contains no frequency components above $f_c$." Since$^{17}$ $\hat x = \mathcal{F}x$ is in $L_1(-i\infty,i\infty) \cap L_2(-i\infty,i\infty)$, for all $x \in M_t$, it follows from the theory of the $L_1$-Fourier transform that $x = \mathcal{F}^{-1}\hat x$ is (i) continuous, (ii) bounded, and (iii) $\lim_{t \to \pm\infty} x(t) = 0$. We hasten to add that there are functions in $H_t$ satisfying (i), (ii), (iii) that are not in $M_t$. (See R. Goldberg [1].)

Next let $\mathcal{M}_\omega$ denote the closed linear subspace of $H_\omega$ defined by
\[ \mathcal{M}_\omega = \{X \in H_\omega : X(i\omega) = 0 \text{ for } |\omega| \ge \omega_c\}. \]
Obviously, $\mathcal{F}(M_t) = \mathcal{M}_\omega$, so $M_t$ and $\mathcal{M}_\omega$ are themselves unitarily equivalent Hilbert spaces.
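We state that the set $\{\phi_n\}$ in $M_t$, where
\[ \phi_n(t) = \frac{1}{\sqrt{T}}\, \frac{\sin(\omega_c t - n\pi)}{\omega_c t - n\pi}, \qquad n = \ldots, -1, 0, 1, 2, \ldots, \]
is an orthonormal set. We also state that $\hat\phi_n = \mathcal{F}\phi_n$ is given by
\[ \hat\phi_n(i\omega) = \begin{cases} \sqrt{T}\, e^{-i\omega n T}, & |\omega| < |\omega_c|, \\ 0, & |\omega| > |\omega_c|, \end{cases} \qquad n = \ldots, -1, 0, 1, 2, \ldots, \]
where $T = \pi/\omega_c$. The set $\{\hat\phi_n\}$ is, of course, an orthonormal set in $\mathcal{M}_\omega$. Moreover, it follows from Example 2, Section 18 that $\{\hat\phi_n\}$ is maximal in $\mathcal{M}_\omega$. Thus, $\{\phi_n\}$ is maximal in $M_t$. It follows, then, that if $x$ is any point in $M_t$, it can be expressed in the form
\[ x = \sum_{n=-\infty}^{\infty} (x, \phi_n)\phi_n. \]
Since $\phi_n(kT) = (1/\sqrt{T})\,\delta_{nk}$, where $\delta_{nk}$ is the Kronecker function, one would suspect that for continuous functions $x$, one has
\[ x(kT) = \frac{1}{\sqrt{T}}\,(x, \phi_k), \]
and this is in fact the case. In order to show this one would start with
\[ \lim_{N \to \infty} \Big\| x - \sum_{k=-N}^{N} (x, \phi_k)\phi_k \Big\| = 0 \]
and use the continuity of $x$ together with the inequality
\[ \Big| \sum_{k \ne n} (x, \phi_k)\phi_k(t) \Big| \le K |\omega_c t - n\pi| \]
for $-\tfrac{1}{2}\pi \le \omega_c t - n\pi \le \tfrac{1}{2}\pi$, where $K > 0$. We leave the details as an exercise.

In summary, we have that any continuous $x \in M_t$ can be written
\[ x(t) = \sum_{k=-\infty}^{\infty} x(kT)\, \frac{\sin(\omega_c t - k\pi)}{\omega_c t - k\pi}. \tag{5.19.5} \]


$^{17}$ It is in $L_2$ and has compact support; therefore, it is also in $L_1$.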


The usual interpretation of (5.19.5) is that a bandlimited signal can be complete- 
ly recovered from samples of its value [that is, x(kT),k =..., —1,0,1,2,...] as 
long as the samples are taken frequently enough. Finally, we remark that this is just 
one of a large family of results concerning “‘sampling,”’ see Beutler [1]. J 
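The sampling formula (5.19.5) can be tried out numerically with a truncated series. The sketch below is illustrative only: the choice $\omega_c = \pi$ (so $T = 1$), the test signal `example_signal`, and the truncation to `n_terms` samples on each side are all assumptions. The test signal $[\sin(\pi t/2)/(\pi t/2)]^2$ is chosen because its Fourier transform is a triangle supported on $|\omega| \le \pi$, so it lies in $M_t$ for $\omega_c = \pi$.

```python
import math

def sinc(u):
    """sin(u)/u, with the removable singularity at u = 0 filled in."""
    return 1.0 if u == 0.0 else math.sin(u) / u

def reconstruct(x, t, omega_c, n_terms):
    """Truncated version of (5.19.5): sum of x(kT) sinc(omega_c t - k pi)
    over k = -n_terms, ..., n_terms, with T = pi / omega_c."""
    T = math.pi / omega_c
    return sum(x(k * T) * sinc(omega_c * t - k * math.pi)
               for k in range(-n_terms, n_terms + 1))

def example_signal(t):
    """Hypothetical bandlimited test signal (band limit omega_c = pi)."""
    return sinc(math.pi * t / 2.0) ** 2
```

For this signal the samples decay like $1/k^2$, so the truncated sum converges quickly; for a signal in $M_t$ with slowly decaying samples many more terms would be needed, and the truncation error should then be estimated rather than assumed small.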


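EXAMPLE 8. Let $H = L_2(-\infty,\infty)$ and consider the mapping $T$ of $H$ into itself defined by
\[ (Tx)(t) = \begin{cases} x(t), & \text{for } t > 0 \\ -x(t), & \text{for } t < 0. \end{cases} \]
$T$ is obviously linear. Moreover,
\[ (Tx, Ty) = \int_{-\infty}^{0} x(t)\overline{y(t)}\,dt + \int_{0}^{\infty} x(t)\overline{y(t)}\,dt = (x,y). \]
Furthermore, $T^2 = I$, so $T$ is invertible. It follows that $T$ is a unitary operator.

Let $P_+$ denote the orthogonal projection defined by
\[ (P_+ x)(t) = \begin{cases} x(t), & \text{for } t > 0 \\ 0, & \text{for } t < 0 \end{cases} \]
and let $P_-$ denote the orthogonal projection defined by
\[ (P_- x)(t) = \begin{cases} 0, & \text{for } t > 0 \\ x(t), & \text{for } t < 0. \end{cases} \]
Note that $P_+ + P_- = I$ and $P_+ P_- = P_- P_+ = 0$. In addition,
\[ T = \lambda_1 P_+ + \lambda_2 P_-, \]
where $\lambda_1 = 1$ and $\lambda_2 = -1$. If we let $M_+$ denote those functions $x(t)$ in $L_2(-\infty,\infty)$ that vanish for $t < 0$, and $M_-$ denote those functions that vanish for $t > 0$, then $Tx = x$ for all $x \in M_+$ and $Tx = -x$ for all $x \in M_-$. Furthermore, one has
\[ M_+^{\perp} = M_-, \qquad M_-^{\perp} = M_+ \qquad \text{and} \qquad H = M_+ + M_-. \]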
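A discrete stand-in makes the relations among $T$, $P_+$ and $P_-$ easy to check. In the sketch below a "function" is a dictionary of values on a finite grid of nonzero sample points, which is an assumption made purely for illustration (the text works in $L_2(-\infty,\infty)$); the function names are hypothetical.

```python
def apply_T(x):
    """Keep x(t) for t > 0 and flip the sign for t < 0."""
    return {t: (v if t > 0 else -v) for t, v in x.items()}

def proj_plus(x):
    """P_+: keep the values at t > 0, set the rest to zero."""
    return {t: (v if t > 0 else 0) for t, v in x.items()}

def proj_minus(x):
    """P_-: keep the values at t < 0, set the rest to zero."""
    return {t: (v if t < 0 else 0) for t, v in x.items()}

def inner(x, y):
    """Discrete inner product over the common grid of x and y."""
    return sum(x[t] * complex(y[t]).conjugate() for t in x)
```

On such a grid one can confirm that applying `apply_T` twice returns the original data, that `apply_T` agrees with `proj_plus` minus `proj_minus` pointwise, and that `inner` is unchanged when both arguments are passed through `apply_T`.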


If we consider continuous linear mappings of $L_2(-\infty,\infty)$ into itself that can be represented in the form
\[ z(t) = \int_{-\infty}^{\infty} k(t - \tau)x(\tau)\,d\tau, \qquad t \in (-\infty,\infty), \]
where $k \in L_1(-\infty,\infty) \cap L_2(-\infty,\infty)$, then the $k$'s that correspond to causal systems are exactly those that satisfy the equation $Tk = k$. By using the Fourier transform $\mathcal{F}$ the equation $Tk = k$ becomes
\[ T\mathcal{F}^{-1}\mathcal{F}k = \mathcal{F}^{-1}\mathcal{F}k, \qquad \text{or} \qquad (\mathcal{F}T\mathcal{F}^{-1})\hat k = \hat k, \]
where $\hat k = \mathcal{F}k$ is the Fourier transform of the "unit impulse response" $k$.

Let $\mathcal{T} = \mathcal{F}T\mathcal{F}^{-1}$, and let us see how $\mathcal{T}$ transforms a real-valued function $r$ in $L_2(-i\infty,i\infty)$. First the inverse Fourier transform of $r$ has an even real part and an odd imaginary part. The $T$ operating on $\mathcal{F}^{-1}r$ converts the even real part to an odd function and the odd imaginary part to an even function. But the Fourier transform of a complex-valued function with odd real part and even imaginary part is a function with zero real part. So $\mathcal{T}$ maps real-valued functions into imaginary-valued functions.

If we let $\hat k = \rho + i\sigma$, where $\rho$ and $\sigma$ are, respectively, the real and imaginary parts of $\hat k$, then it follows that $\mathcal{T}\hat k = \hat k$ becomes
\[ \mathcal{T}\rho + i\mathcal{T}\sigma = \rho + i\sigma. \]
But $\mathcal{T}\rho$ is imaginary-valued and $i\mathcal{T}\sigma$ is real-valued, so the above equation becomes
\[ \rho = i\mathcal{T}\sigma \]
\[ \sigma = -i\mathcal{T}\rho. \]
In other words, as is well-known, the real and imaginary parts of $\hat k$ are interdependent for causal systems.


Finally let us briefly return to the topic of equivalent operators, which we began 
to study in Section 4.9. 

Clearly, everything said in that purely algebraic context is still applicable in 
normed linear spaces, Banach spaces, and here in inner product spaces. It is also 
obvious that now with topological structure present one not only wants linear 
transformations to be equivalent in an algebraic sense but also in a topological 
sense. Whereas isomorphisms were used in Section 4.9, in this new context one 
would usually use topological and isometric isomorphisms. The classic case of this 
is two operators being unitarily equivalent. 


5.19.3 DEFINITION. Let $T: X_1 \to X_1$ and $S: X_2 \to X_2$ be linear operators where $X_1$ and $X_2$ are inner product spaces. The operators $T$ and $S$ are said to be unitarily equivalent if there exists a unitary mapping $U$ of $X_1$ onto $X_2$ such that
\[ T = U^{-1}SU \qquad \text{and} \qquad S = UTU^{-1}. \]
This situation is illustrated in Figure 5.19.2.

Undoubtedly two of the best known examples of pairs of operators being unitarily equivalent arise in the study of time-invariant linear operators on $L_2(-\infty,\infty)$ and $l_2(-\infty,\infty)$, as we now see.




Figure 5.19.2 


EXAMPLE 9. Let $X_1 = L_2(-\infty,\infty)$, and let $T$ be a bounded linear time-invariant transformation of $X_1$ into itself. We also know from Example 6 the Fourier transform $\mathcal{F}$ is a unitary mapping of $L_2(-\infty,\infty)$ onto $L_2(-i\infty,i\infty)$. The operator $S: L_2(-i\infty,i\infty) \to L_2(-i\infty,i\infty)$ defined by
\[ S = \mathcal{F}T\mathcal{F}^{-1} \]
is obviously unitarily equivalent to $T$. The fundamental fact about $S$ is that $S$ has a simple form. In particular,
\[ (S\hat x)(i\omega) = T(i\omega)\hat x(i\omega), \]
where $T(i\omega)$ is the transfer function associated with the operator $T$. In other words, $S$ is an operation of multiplication by a function, see Bochner [1].

It should be remarked that if causality is left aside there is a one-to-one correspondence between bounded linear time-invariant mappings $T$ of $L_2(-\infty,\infty)$ into itself and measurable essentially bounded complex-valued functions $T(i\omega)$ defined on the imaginary axis.


EXAMPLE 10. Let $X = l_2(-\infty,\infty)$, and let $T$ be a bounded linear time-invariant transformation of $X$ into itself. Let $Z$ be the inner product space of complex-valued functions defined on the unit circle of the complex plane used above in Example 3. We showed in Example 4 that the two-sided z-transform $\mathcal{Z}: X \to Z$ is a unitary operator. It follows that the operator $S: Z \to Z$ defined by
\[ S = \mathcal{Z}T\mathcal{Z}^{-1} \]
is unitarily equivalent to $T$. As with Example 9, the fundamental result here is that
\[ (Sf)(z) = T(z)f(z) \]
for all $z$ such that $|z| = 1$, where $T(z)$ is the transfer function associated with the operator $T$.
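A finite, cyclic analog of Examples 9 and 10 can be computed directly. The sketch below is not the setting of the text: it replaces $l_2(-\infty,\infty)$ by $C^N$ with indices taken mod $N$, the transform by the unitary discrete Fourier transform, and the time-invariant operator by cyclic convolution with a hypothetical kernel `h`. Under those assumptions, $S = UTU^{-1}$ acts as multiplication by the transfer function.

```python
import cmath

def dft(x):
    """Unitary discrete Fourier transform on C^N (note the 1/sqrt(N) scaling)."""
    N = len(x)
    return [sum(x[n] * cmath.exp(-2j * cmath.pi * k * n / N) for n in range(N)) / N ** 0.5
            for k in range(N)]

def idft(X):
    """Inverse of dft, which is also its adjoint."""
    N = len(X)
    return [sum(X[k] * cmath.exp(2j * cmath.pi * k * n / N) for k in range(N)) / N ** 0.5
            for n in range(N)]

def cyclic_convolution(h, x):
    """A shift-invariant operator T on C^N: (Tx)_n = sum_m h_{(n-m) mod N} x_m."""
    N = len(x)
    return [sum(h[(n - m) % N] * x[m] for m in range(N)) for n in range(N)]

def transfer_function(h):
    """Values H_k such that (U T U^{-1} X)_k = H_k X_k, where U is dft."""
    N = len(h)
    return [sum(h[n] * cmath.exp(-2j * cmath.pi * k * n / N) for n in range(N))
            for k in range(N)]
```

In this toy setting, `dft(cyclic_convolution(h, x))` agrees entrywise with the product of `transfer_function(h)` and `dft(x)`, which is the finite counterpart of $(Sf)(z) = T(z)f(z)$.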


EXERCISES 


1. Show that $l_2(-\infty,\infty)$ and $l_2[0,\infty)$ are unitarily equivalent.


2. Show that every separable Hilbert space is unitarily equivalent with $l_2$ or $C^n$ for some $n \ge 0$.




3. Let $U$ be a unitary mapping on an inner product space $X$. Show that $\|U\| = 1$.


4. Let $I = [a,b]$ be a bounded interval and define $F: L_2(I) \to L_2(I)$ by $y = Fx$ or $y(t) = f(t)x(t)$, where $f \in L_\infty(I)$. Show that $F$ is a unitary mapping if and only if $|f(t)| = 1$, almost everywhere.

5. Let $X$ be an inner product space and let $B$ be a Banach space that is the completion of $X$. (See Exercise 6, Section 9.) Show that $B$ is a Hilbert space.

6. Let $U$ be a unitary operator on a Hilbert space $H$ and define
\[ A_n = (1/n)[I + U + \cdots + U^{n-1}] \qquad \text{for } n = 1, 2, \ldots. \]
Let $M = \{x - Ux : x \in H\}$.
(a) Show that if $y \in M$, then $\|A_n y\| \to 0$ as $n \to \infty$.
(b) Show that if $y \in \overline{M}$, then $\|A_n y\| \to 0$ as $n \to \infty$.
(c) Show that if $x \in M^{\perp}$, then $Ux = x$.
(d) Show that $P = \lim A_n$, where $P$ is the orthogonal projection onto $M^{\perp} = N(I - U)$.
(e) Show that $\|A_n - P\| \to 0$.


7. Let $H_1$ and $H_2$ be two Hilbert spaces with the same Hilbert dimension. (Compare with Exercise 6, Section 17.) Show that $H_1$ and $H_2$ are unitarily equivalent.


8. Let $P$ be an orthogonal projection on a Hilbert space $H$. Also let $U_t = e^{itP}$, $-\infty < t < \infty$.
(a) Show that for each $t$, $U_t$ is a unitary operator.
(b) When is $P$ a unitary operator?


20. SUMS AND DIRECT SUMS OF HILBERT SPACES 


We introduced the concepts of sum and direct sum in Section 4.10. Again, 
since Hilbert spaces are linear spaces, everything said about sums and direct sums 
in Chapter 4 applies here. The purpose of this section is to see what else can be said 
now that we have the topological structure generated by an inner product. More 
specifically we shall be interested in the question of sums of mutually orthogonal 
closed linear subspaces of a Hilbert space. 

In Section 15 we discussed the properties of sums of finitely many mutually orthogonal closed linear subspaces (see Theorem 5.15.5 and Exercise 3, Section 15). So in this section we shall focus our attention on infinite sums of these spaces. For simplicity we shall consider only countable sums. The discussion in Section 17 on uncountable sums also applies here and the results we describe below readily extend to uncountable sums.


5.20.1 DEFINITION. A collection $\{M_n\}$ of sets in a Hilbert space $H$ is said to be mutually orthogonal if $M_n \perp M_m$ whenever $n \ne m$. If $\{M_n\}$ is a collection of mutually orthogonal closed linear subspaces of a Hilbert space $H$ we shall define the (topological) sum
\[ M = \sum_n M_n = M_1 + M_2 + \cdots \]
to be the closure of the linear subspace of $H$ generated by $\{M_n\}$.




It should be noted here that this definition does not agree with the algebraic concept introduced in Section 4.10. The algebraic sum of these spaces would consist of all finite sums of vectors from $\{M_n\}$, whereas the topological sum also includes limits of sequences of these finite sums. Because of Theorem 5.15.5 and Exercise 3, Section 15 we see that the topological sum and the algebraic sum agree when the collection $\{M_n\}$ is finite. It is only when the collection of spaces $\{M_n\}$ is infinite that these concepts differ.


The next theorem shows that each point in $M_1 + M_2 + M_3 + \cdots$ can be expressed as a series.


5.20.2 THEOREM. (ORTHOGONAL STRUCTURE THEOREM.) Let $\{M_n\}$ be a countable collection of mutually orthogonal, closed linear subspaces of a Hilbert space $H$, and let $M = M_1 + M_2 + \cdots$ be the sum as defined above. Then each $x \in M$ can be expressed uniquely as
\[ x = \sum_n x_n \]
where
\[ x_n \in M_n, \quad \text{for all } n, \]
and
\[ \|x\|^2 = \sum_n \|x_n\|^2. \]
Furthermore, if $x_n \in M_n$ and $\sum_n \|x_n\|^2 < \infty$, then there exists an $x \in M$ such that
\[ x = \sum_n x_n. \]


Carefully note that this is a statement about the orthogonal structure of Hilbert spaces, not inner product spaces in general. Indeed, it is not difficult to find a collection $\{M_n\}$ of mutually orthogonal closed linear subspaces of an incomplete inner product space $X$ such that Theorem 5.20.2 is not satisfied.$^{18}$


Proof: This proof is very much like the proof of Lemma 5.17.14. Since $x \in M$, there is a sequence $\{y_N\}$ converging to $x$, where each $y_N$ can be expressed as
\[ y_N = x_{N1} + x_{N2} + \cdots + x_{NK}, \qquad N = 1, 2, \ldots, \]
where $x_{Nn} \in M_n$ and $K$ depends on $N$. For simplicity we set $x_{Nn} = 0$ for $n > K$. Since $M_n$ is closed, there is (Theorem 5.14.4) a unique point $x_n \in M_n$ such that $\|x - x_n\| \le \|x - m\|$ for all $m \in M_n$ and $x - x_n \perp M_n$. In particular, then $\|x - x_n\| \le \|x - x_{Nn}\|$ for all $N$ and $n$. Since $M_n + M_m$ is closed (Theorem 5.15.5), there is a unique point $z \in M_n + M_m$ such that $\|x - z\| \le \|x - m\|$ for all $m \in M_n + M_m$ and $x - z \perp M_n + M_m$. Let us show that $z = x_n + x_m$. Since $z \in M_n + M_m$, there is a unique decomposition $z = z_n + z_m$, where $z_n \in M_n$ and $z_m \in M_m$. Because $x - z \perp M_n + M_m$, we have $(x - z, m) = 0$ for all $m \in M_n + M_m$. Hence, $(x - z_n - z_m, m_n) = 0$ for all $m_n \in M_n$. Therefore, $(x - z_n, m_n) = 0$ for all $m_n \in M_n$. It follows that $z_n = x_n$. Similarly, one has $z_m = x_m$, or $z = x_n + x_m$. In general, then,
\[ \|x - x_1 - x_2 - \cdots - x_K\| \le \|x - y_N\|. \]
Since $x = \lim_{N \to \infty} y_N$, it follows that
\[ x = \sum_n x_n. \]
The uniqueness of the expansion and the equality $\|x\|^2 = \sum_n \|x_n\|^2$ follow from Lemma 5.17.10(b), and the last part of the theorem is a consequence of Lemma 5.17.10(a).


$^{18}$ The topological sum $\sum_n M_n$ is defined just as in a Hilbert space.


EXAMPLE 1. The importance of the Orthogonal Structure Theorem will become evident in the next chapter. However, we can give a preview here. Let $H$ be a Hilbert space and let $L: H \to H$ be a bounded linear operator. Assume that there exists a collection $\{M_n\}$ of mutually orthogonal nontrivial closed linear subspaces of $H$ with $H = \sum_n M_n$. Assume further that there is a sequence $\{\lambda_n\}$ of scalars with the property that $Lx_n = \lambda_n x_n$ for all $x_n \in M_n$. Finally, assume that the $\lambda_n$'s satisfy $|\lambda_{n+1}| \le |\lambda_n|$ for all $n$. Now if $x \in H$ and if $x = \sum_n x_n$, where $x_n \in M_n$, then $Lx = L(\sum_n x_n) = \sum_n Lx_n = \sum_n \lambda_n x_n$. The series $\sum_n \lambda_n x_n$ converges since
\[ \sum_n \|\lambda_n x_n\|^2 \le |\lambda_1|^2 \sum_n \|x_n\|^2 = |\lambda_1|^2 \|x\|^2. \]
The last inequality also shows that $\|Lx\| \le |\lambda_1| \|x\|$, in other words, $\|L\| \le |\lambda_1|$. One can actually show that $\|L\| = |\lambda_1|$. Indeed, if $x \in M_1$ is chosen so that $\|x\| = 1$, one then has
\[ |\lambda_1| = |\lambda_1| \cdot \|x\| = \|\lambda_1 x\| = \|Lx\| \le \|L\| \cdot \|x\| = \|L\|. \]
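The norm computation in this example can be mimicked in $C^n$. In the sketch below each $M_n$ is taken to be a one-dimensional coordinate axis, so $L$ is a diagonal operator; this finite-dimensional setup and the sample values are assumptions for illustration only.

```python
import math

def apply_L(lams, x):
    """L multiplies the n-th coordinate (a one-dimensional M_n) by lams[n]."""
    return [lam * xi for lam, xi in zip(lams, x)]

def norm(x):
    """Euclidean norm on C^n."""
    return math.sqrt(sum(abs(xi) ** 2 for xi in x))

# Hypothetical eigenvalues with |lams[0]| >= |lams[1]| >= ...
lams = [3, -2, 1j, 0.5]
x = [1, 2 - 1j, -3, 4]
ratio = norm(apply_L(lams, x)) / norm(x)                 # never exceeds |lams[0]|
attained = norm(apply_L(lams, [1, 0, 0, 0]))             # equals |lams[0]| on M_1
```

The quantity `ratio` stays below $|\lambda_1| = 3$ for every nonzero `x`, while `attained` shows the bound is reached on a unit vector in $M_1$, matching $\|L\| = |\lambda_1|$.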


Next let us turn to the question of direct sums of inner product spaces. Let $X_1$ and $X_2$ be two inner product spaces. The direct sum $X_1 \oplus X_2$, as a linear space, was introduced in Section 4.10. We will use, perhaps somewhat confusingly, exactly the same notation to denote the direct sum of the inner product spaces $X_1$ and $X_2$. If $(\cdot\,,\cdot)_1$ and $(\cdot\,,\cdot)_2$ are the inner products on $X_1$ and $X_2$, respectively, then the inner product on $X_1 \oplus X_2$ is defined by
\[ (x,y) = (x_1, y_1)_1 + (x_2, y_2)_2, \]
where $x$ and $y$ are the ordered pairs $\{x_1, x_2\}$ and $\{y_1, y_2\}$, respectively, with $x_1, y_1 \in X_1$ and $x_2, y_2 \in X_2$. We leave it to the reader to show that this does define an inner product. We also remark that if $X_1$ and $X_2$ are complete, then so is $X_1 \oplus X_2$.

If $\{X_1, \ldots, X_n\}$ is a finite collection of inner product spaces, we define $X_1 \oplus X_2 \oplus \cdots \oplus X_n$ in the obvious way. However, if $\{X_i\}$ is a countably infinite collection, we have to say a few things.




Let $\{X_i\}$ be a countable collection of inner product spaces, with $(\cdot\,,\cdot)_i$ denoting the inner product on $X_i$. Now define the set
\[ X = X_1 \times X_2 \times X_3 \times \cdots \]
as follows. Each point $x \in X$ is a sequence, $x = \{x_i\}$, where $x_i \in X_i$, $i = 1, 2, \ldots$, and such that
\[ \sum_i \|x_i\|_i^2 < \infty. \]
Using the natural structure on $X$, it is a linear space. We claim that
\[ (x,y) = \sum_i (x_i, y_i)_i, \tag{5.20.1} \]
where $x = \{x_1, x_2, \ldots\}$ and $y = \{y_1, y_2, \ldots\}$ are in $X$, defines an inner product on $X$. Let us prove that the series (5.20.1) is absolutely convergent. By using the Schwarz Inequality twice, we get
\[ \sum_i |(x_i, y_i)_i| \le \sum_i \|x_i\|_i \|y_i\|_i \le \Big(\sum_i \|x_i\|_i^2\Big)^{1/2} \Big(\sum_i \|y_i\|_i^2\Big)^{1/2}, \]
showing that the series in (5.20.1) is absolutely convergent. The rest of the proof that (5.20.1) defines an inner product is straightforward.

We denote $X$ equipped with the inner product (5.20.1) by
\[ X = X_1 \oplus X_2 \oplus X_3 \oplus \cdots \]
and refer to it as the direct sum of $\{X_i\}$.

Let us now prove that if all the $X_i$'s are complete, then $X$ is complete. Let $\{x^n\}$ be a Cauchy sequence in $X$. Since
\[ \|x_i - y_i\|_i \le \|x - y\|, \]
each $\{x_i^n\}$ is a Cauchy sequence in $X_i$. Since $X_i$ is complete, each $x_i^n$ converges, say that $x_i^n \to y_i$ as $n \to \infty$. Now let $y = \{y_1, y_2, \ldots\}$. We want to show that $y \in X$. Since $\{x^n\}$ is a Cauchy sequence, for every $\varepsilon > 0$ one can find an integer $N$ such that
\[ \sum_i \|x_i^n - x_i^m\|_i^2 \le \varepsilon^2, \]
whenever $n, m \ge N$. If we let $m \to \infty$, the last inequality becomes (compare with Exercise 9, Section 2)
\[ \sum_i \|x_i^n - y_i\|_i^2 \le \varepsilon^2, \tag{5.20.2} \]
for $n \ge N$. In other words, $(y - x^n)$ is in $X$ for $n \ge N$. Since $y = (y - x^n) + x^n$, it follows that $y \in X$. Furthermore, it follows from (5.20.2) that $x^n \to y$ in $X$.

The last thing to do is show the connection between sums and direct sums. Suppose that $M_1, M_2, \ldots$ are mutually orthogonal, closed linear subspaces of a Hilbert space $H$. Just as in Section 4.10, we want to compare $M_S = M_1 + M_2 + \cdots$ and $M_{DS} = M_1 \oplus M_2 \oplus \cdots$. It goes without saying that these are different spaces. On the other hand, we shall see that they are unitarily equivalent.




Let $\psi$ be the mapping of $M_S$ into $M_{DS}$ defined as follows: We know from the Orthogonal Structure Theorem that each $x \in M_S$ can be uniquely expressed as $x = x_1 + x_2 + \cdots$, where $x_i \in M_i$, $i = 1, 2, \ldots$. We define $\psi(x) = \{x_1, x_2, x_3, \ldots\}$.

It follows that $\psi$ is linear and maps $M_S$ into $M_{DS}$. Moreover, $\psi$ preserves norms, and the range is all of $M_{DS}$. Hence, $\psi$ is a unitary mapping and we have just proved the following result.


5.20.3 THEOREM. If $X$ is complete, the mapping $\psi: M_S \to M_{DS}$ discussed above is a unitary mapping of $M_S$ onto $M_{DS}$, and $M_S$ and $M_{DS}$ are unitarily equivalent.


Because of this theorem, there is a widespread practice of referring to a sum of 
closed, mutually orthogonal linear subspaces of a Hilbert space as a direct sum. This 
is an innocent abuse of terminology, but the correct use should be understood. 


EXERCISES 


1. Extend the results of this section to uncountable sums. 


2. Show by example that Theorem 5.20.2 does not hold (in general) for inner 
product spaces that are not complete. Where is the completeness of H used in 
the proof of the Orthogonal Structure Theorem? 


21. CONTINUOUS LINEAR FUNCTIONALS 


The concept of linear functionals on a linear space was discussed in Section 4.12. Continuous linear functionals on normed linear spaces were briefly introduced in Section 11 of this chapter. Now we want to discuss linear functionals on inner product spaces. The main result of this section will be to show that every continuous linear functional on a Hilbert space $H$ (completeness is important) has a particularly simple representation. In other words, it is easy to "get our hands on" all the continuous linear functionals.

Let $X$ be an inner product space and let $y$ be a fixed element in $X$. Define a mapping $l$ by
\[ l(x) = (x,y). \]
We claim that $l$ is a bounded (that is, continuous) linear functional and $\|l\| = \|y\|$, where $\|l\|$ is the operator norm defined in Section 8. The linearity follows directly from the definition of inner product. Also, by the Schwarz Inequality we get
\[ |l(x)| = |(x,y)| \le M\|x\|, \qquad \text{for all } x \in X, \]
where $M = \|y\|$. Hence $\|l\| \le \|y\|$. However, $|l(y)| = \|y\| \cdot \|y\|$ so $\|l\| \ge \|y\|$. Thus, $\|l\| = \|y\|$.

We have shown, then, that a continuous linear functional $l$ is naturally associated with each vector $y$ in an inner product space. The extremely important fact about Hilbert spaces, as opposed to inner product spaces, is that this is true the other way around. That is, given any bounded linear functional $l$ there exists a vector $y$ such that
\[ l(x) = (x,y), \qquad \text{for all } x \in H. \]


5.21.1 THEOREM. (RIESZ REPRESENTATION THEOREM.) Let $H$ be a Hilbert space and let $l$ be a bounded linear functional on $H$. Then there is one and only one vector $y \in H$ such that
\[ l(x) = (x,y), \qquad \text{for all } x \in H. \]
The vector $y$ is sometimes called the representation of $l$. However, $l$ and $y$ are different objects, $l$ a linear functional on $H$, and $y$ a point in $H$.


Proof: If $l = 0$, then choose $y = 0$. Now assume that $l \ne 0$ and let $M = \mathcal{N}(l)$, 
the null space of $l$. Since $l$ is linear, M is a linear subspace of H. Since $l$ is continuous, 
M is closed. Furthermore, $M \ne H$, because $l \ne 0$. By Corollary 5.14.5, there is a 
nonzero vector $z \in H$ such that $z \perp M$. We can assume $\|z\| = 1$. We will now show 
that the desired vector y is given by $y = \alpha z$ for some nonzero scalar $\alpha$. 

Since $z \notin M$, one has $l(z) \ne 0$. We shall now show that 

$$l(x) = (x, \overline{l(z)}\,z)$$

for all $x \in H$. By the Projection Theorem (which requires completeness) one has 
$H = M + M^\perp$, so every vector x in H can be uniquely written as $x = m + n$ 
where $m \in M$ and $n \in M^\perp$. We know from Theorem 4.12.2 that $\dim(M^\perp) = 1$. 
So we can write $n = \beta z$, that is, $x = m + \beta z$, for some scalar $\beta$. See Figure 5.21.1. 

Figure 5.21.1. (Labels: the vector $z$ and the null space $M = \mathcal{N}(l)$.) 

Since $\|z\| = 1$, and since the inner product is conjugate-linear in its second argument, we have 

$$l(x) = l(m) + \beta l(z) = 0 + \beta l(z)(z,z)$$
$$= (m, \overline{l(z)}\,z) + (\beta z, \overline{l(z)}\,z)$$
$$= (m + \beta z, \overline{l(z)}\,z) = (x,y),$$

where $y = \overline{l(z)}\,z$. 




The uniqueness of y follows from the fact that if $(x,y) = (x,y')$ for all x, then 
for $x = y - y'$ one has 

$$0 = (x, y - y') = (y - y', y - y'),$$

hence $y = y'$. ∎ 
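
The following is a minimal sketch of the theorem in the finite-dimensional case $H = C^n$, where completeness is automatic. It assumes the inner product $(x,y) = \sum_i x_i \bar y_i$; the functional `l` and the helper name `riesz_vector` are hypothetical and chosen only for illustration. Evaluating $l$ on the standard basis gives the coefficients of $l$, and their complex conjugates give the representing vector $y$.

```python
def inner(x, y):
    """Inner product on C^n: linear in x, conjugate-linear in y."""
    return sum(a * b.conjugate() for a, b in zip(x, y))

def riesz_vector(l, n):
    """Return y in C^n with l(x) = (x, y); y_i is the conjugate of l(e_i)."""
    basis = [[1 if i == j else 0 for j in range(n)] for i in range(n)]
    return [complex(l(e)).conjugate() for e in basis]

def l(x):
    # A hypothetical bounded linear functional on C^3 (illustration only).
    return (2 + 1j) * x[0] - 3j * x[1] + 0.5 * x[2]

y = riesz_vector(l, 3)
x = [1 - 2j, 0.5j, 4 + 0j]
difference = abs(l(x) - inner(x, y))  # should vanish up to rounding
```

The conjugate in `riesz_vector` mirrors the conjugate that appears on $l(z)$ in the proof: it is forced by the conjugate-linearity of the inner product in its second argument.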


There is a variation of the Riesz Representation Theorem which is very useful 
in the study of differential operators, as shown in Section 7.8. We state it here and 
leave the proof as an exercise (see Exercise 10). 


5.21.2 THEOREM. (LAX-MILGRAM THEOREM.) Let $B[u,v]$ be a sesquilinear 
functional on a Hilbert space H and assume that there are positive constants a and b 
such that 

$$|B[u,v]| \le a\,\|u\|\,\|v\|$$

$$b\,\|u\|^2 \le |B[u,u]|$$

for all u, v in H. Let $l$ be any bounded linear functional on H. Then there exist unique 
points $u_0$ and $v_0$ in H such that 

$$l(x) = B[x,v_0] = \overline{B[u_0,x]}$$

for all x in H. 


EXAMPLE 1. We have already discussed the basic Hilbert space approximation 
concept in Theorems 5.14.4, 5.16.5, and Lemma 5.17.14. In all cases, we are interested 
in some closed linear subspace M of a Hilbert space H. The variations on this 
basic theme arise from the way we characterize the subspace M. Here we show how 
continuous linear functionals can be used. In particular, if $l_1$ is a continuous linear 
functional on H, then $\mathcal{N}(l_1)$, the null space of $l_1$, is a closed linear subspace of H. If 
$l_2$ is another continuous linear functional, then $\mathcal{N}(l_1) \cap \mathcal{N}(l_2)$ is also a closed linear 
subspace. In general, if $\{l_\alpha\}$ is a collection (perhaps uncountable) of continuous 
linear functionals, then 

$$M = \bigcap_\alpha \mathcal{N}(l_\alpha)$$

is a closed linear subspace. 

Now if $x_0$ is a point in H, and if we want to find the unique point $y_0 \in M$ closest 
to $x_0$, we know from Theorem 5.14.4 that $x_0 - y_0$ is orthogonal to M. (See 
Figure 5.21.2.) 

Let us assume that $M = \mathcal{N}(l_1) \cap \mathcal{N}(l_2) \cap \cdots \cap \mathcal{N}(l_n)$, that is, we are considering 
a finite collection of functionals. We know from the Riesz Representation 
Theorem that for each $l_i$ there is a $y_i \in H$ such that $l_i(x) = (x,y_i)$ for all $x \in H$, 
$i = 1, 2, \ldots, n$. Hence, 

$$M = \{x \in H : (x,y_1) = (x,y_2) = \cdots = (x,y_n) = 0\}.$$

In other words, $M = \{y_1,y_2,\ldots,y_n\}^\perp$. Moreover, $M^\perp$ is the linear subspace 
spanned by the set $\{y_1,y_2,\ldots,y_n\}$. (Why?) Since $(x_0 - y_0) \in M^\perp$, it follows that 
$(x_0 - y_0)$ can be expressed as 

$$x_0 - y_0 = \alpha_1 y_1 + \cdots + \alpha_n y_n,$$


Figure 5.21.2 


where the $\alpha$'s are scalars. Then since $y_0 \in M$, we have 

$$(x_0 - y_0, y_1) = (x_0, y_1) = \alpha_1(y_1,y_1) + \cdots + \alpha_n(y_n,y_1)$$
$$\vdots$$
$$(x_0 - y_0, y_n) = (x_0, y_n) = \alpha_1(y_1,y_n) + \cdots + \alpha_n(y_n,y_n).$$

Everything in the above system of equations is known except the $\alpha$'s, so one merely 
solves for the $\alpha$'s and, then, 

$$y_0 = x_0 - \alpha_1 y_1 - \cdots - \alpha_n y_n.$$

Let us note that there may be more than one solution for the $\alpha$'s. This would occur 
if the collection $\{y_1,\ldots,y_n\}$ is not linearly independent. ∎ 
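
The procedure of Example 1 can be sketched numerically in $R^3$. The sketch below assumes real scalars and linearly independent representers $y_i$, so the system for the $\alpha$'s has a unique solution; the data `ys` and `x0` and the helper names are hypothetical.

```python
def dot(x, y):
    """Real inner product on R^n."""
    return sum(a * b for a, b in zip(x, y))

def solve(A, b):
    """Solve A a = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(M[r][c]))
        M[c], M[p] = M[p], M[c]
        for r in range(c + 1, n):
            f = M[r][c] / M[c][c]
            for k in range(c, n + 1):
                M[r][k] -= f * M[c][k]
    a = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = sum(M[r][k] * a[k] for k in range(r + 1, n))
        a[r] = (M[r][n] - s) / M[r][r]
    return a

def closest_point(x0, ys):
    """Closest point to x0 in M = {x : (x, y_i) = 0 for every y_i in ys}."""
    # Row i of the system reads (x0, y_i) = sum_j alpha_j (y_j, y_i).
    G = [[dot(yj, yi) for yj in ys] for yi in ys]
    rhs = [dot(x0, yi) for yi in ys]
    alpha = solve(G, rhs)
    return [x0[k] - sum(a * y[k] for a, y in zip(alpha, ys))
            for k in range(len(x0))]

ys = [[1.0, 1.0, 0.0], [0.0, 1.0, 1.0]]  # representers of l_1 and l_2
x0 = [1.0, 2.0, 3.0]
y0 = closest_point(x0, ys)
```

For these data M is the line spanned by (1, -1, 1), and `y0` is the orthogonal projection of `x0` onto that line. If the $y_i$ were linearly dependent, the elimination step would meet a zero pivot, which corresponds to the non-uniqueness of the $\alpha$'s noted above.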


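EXAMPLE 2. The reasoning of Example 1 can be used if M is a hyperplane 
defined by functionals. For example, assume that a second-order time-varying linear 
system is modeled by the equations 

$$\begin{pmatrix} x_1(t) \\ x_2(t) \end{pmatrix} = \begin{pmatrix} \phi_{11}(t,t_0) & \phi_{12}(t,t_0) \\ \phi_{21}(t,t_0) & \phi_{22}(t,t_0) \end{pmatrix} \begin{pmatrix} x_1(t_0) \\ x_2(t_0) \end{pmatrix} + \int_{t_0}^{t} \begin{pmatrix} \phi_{11}(t,\tau) & \phi_{12}(t,\tau) \\ \phi_{21}(t,\tau) & \phi_{22}(t,\tau) \end{pmatrix} \begin{pmatrix} b_1 \\ b_2 \end{pmatrix} u(\tau)\,d\tau.$$

Suppose that starting at $(x_1(t_0),x_2(t_0))$ we want to choose the input u so that we 
arrive at a prescribed point $(x_1(T),x_2(T))$ at time T. Moreover, suppose we want to 
choose the u with the smallest $L_2$-norm that does this. It follows that u must satisfy 
the equations 

$$\int_{t_0}^{T} h_1(\tau)u(\tau)\,d\tau = A_1 \tag{5.21.1}$$

and 

$$\int_{t_0}^{T} h_2(\tau)u(\tau)\,d\tau = A_2, \tag{5.21.2}$$

where 

$$h_i(\tau) = \phi_{i1}(T,\tau)b_1 + \phi_{i2}(T,\tau)b_2, \quad i = 1, 2,$$

and 

$$A_i = x_i(T) - \phi_{i1}(T,t_0)x_1(t_0) - \phi_{i2}(T,t_0)x_2(t_0), \quad i = 1, 2.$$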


Figure 5.21.3 

Equations (5.21.1) and (5.21.2) define a hyperplane M in $L_2[t_0,T]$. (See Figure 
5.21.3.) It follows that the input u of minimum norm can be written 

$$u = \alpha_1 h_1 + \alpha_2 h_2$$

and 

$$(\alpha_1 h_1 + \alpha_2 h_2, h_1) = A_1,$$
$$(\alpha_1 h_1 + \alpha_2 h_2, h_2) = A_2,$$

or 

$$\begin{pmatrix} \alpha_1 \\ \alpha_2 \end{pmatrix} = \frac{1}{(h_1,h_1)(h_2,h_2) - (h_1,h_2)(h_2,h_1)} \begin{pmatrix} (h_2,h_2) & -(h_2,h_1) \\ -(h_1,h_2) & (h_1,h_1) \end{pmatrix} \begin{pmatrix} A_1 \\ A_2 \end{pmatrix}.$$
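
As an illustration of Example 2, the sketch below works one concrete case. The system is an assumption made for this example and does not come from the text: a double integrator $x_1' = x_2$, $x_2' = u$ on $[0,1]$, for which $\phi_{12}(T,\tau) = T - \tau$, $\phi_{22} = 1$, $b = (0,1)$, so that $h_1(\tau) = T - \tau$ and $h_2(\tau) = 1$. The target is to move from rest at the origin to the point (1, 0). Inner products are approximated with Simpson's rule; all scalars are real.

```python
def simpson(f, a, b, n=200):
    """Composite Simpson rule on [a, b] with an even number n of panels."""
    h = (b - a) / n
    s = f(a) + f(b)
    for k in range(1, n):
        s += (4 if k % 2 else 2) * f(a + k * h)
    return s * h / 3

t0, T = 0.0, 1.0

def inner(f, g):
    """Real L2[t0, T] inner product, approximated numerically."""
    return simpson(lambda t: f(t) * g(t), t0, T)

def h1(tau):
    return T - tau      # assumed: phi_12(T, tau) * b_2 for the double integrator

def h2(tau):
    return 1.0          # assumed: phi_22(T, tau) * b_2

A1, A2 = 1.0, 0.0       # hypothetical target: from (0, 0) to (1, 0)

g11, g12, g22 = inner(h1, h1), inner(h1, h2), inner(h2, h2)
det = g11 * g22 - g12 * g12
alpha1 = (g22 * A1 - g12 * A2) / det
alpha2 = (-g12 * A1 + g11 * A2) / det

def u(tau):
    """Minimum-norm input u = alpha1*h1 + alpha2*h2."""
    return alpha1 * h1(tau) + alpha2 * h2(tau)
```

Under these assumptions the Gram entries are 1/3, 1/2 and 1, which gives $\alpha_1 = 12$ and $\alpha_2 = -6$, that is, $u(\tau) = 6 - 12\tau$. Substituting `u` back into (5.21.1) and (5.21.2) confirms both constraints.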


EXERCISES 


1. Let C[0,1] be considered as a linear subspace of $L_2[0,1]$, where $L_2[0,1]$ has the 
usual inner product. Define $l: C[0,1] \to C$ by $l(x) = x(\tfrac{1}{2})$. 
(a) Show that $l$ is an unbounded linear functional. 
(b) Show that $l$ is a bounded linear functional when C[0,1] has the sup-norm. 


2. For each $f \in L_2[0,1]$ let $\phi(t)$ be the solution of $y' + ay = f$ that satisfies $\phi(0) = 0$, 
where a is a constant. Define $l: L_2[0,1] \to C$ by 

$$l(f) = \int_0^1 \phi(t)\,dt.$$

(a) Show that $l$ is a bounded linear functional. 
(b) Find the representation of $l$ as $l(f) = (f,g)$. 

3. For each $f \in L_2[0,1]$ let $\phi(t)$ be the solution of $y'' + ay' + by = f$ that satisfies 
$\phi(0) = \phi'(0) = 0$ where a and b are constants. Define $l: L_2[0,1] \to C$ by 

$$l(f) = \int_0^1 \phi(t)\,dt.$$

(a) Show that $l$ is a bounded linear functional. 

(b) Find the representation of $l$ as $l(f) = (f,g)$. 

[Hint: Distinguish between the two cases where the roots of $r^2 + ar + b = 0$ are 
the same or where they differ.] 




4. Consider the real inner product space $R^n$ with the inner product given by 

$$(x,y) = \sum_{i,j} x_i a_{ij} y_j,$$

where $A = (a_{ij})$ is a real symmetric n x n matrix that satisfies $\sum_{i,j} x_i a_{ij} x_j > 0$ 
when $x = (x_1,\ldots,x_n) \ne 0$. Find a representation for $l: R^n \to R$ when $l$ is given 
by 
(a) $l(x) = x_1$; 
(b) $l(x) = x_1 + x_2$; 
(c) $l(x) = \sum_{i=1}^{n} x_i b_i$, ($b_i$ real). 
5. The time-varying network in Figure 5.21.4 satisfies the differential equation 


where R is the resistance, c(t) the capacitance, u(t) the input voltage, x(t) the 
charge density, and v(t) = x(t)/c(t). We assume that c(t) is positive and continuous 
for $0 \le t \le 1$ and that x(0) = 0. 

(a) Show that the set of inputs u(t), with the property that $u \in L_2[0,1]$ and 
v(1) = 1, satisfies $l(u) = \alpha$ for some functional $l$ and some constant $\alpha$. What is 
the representation of $l$? 

(b) Find the input u in $L_2[0,1]$ with the property that v(1) = 1 and $\int_0^1 |u|^2\,dt$ = 
minimum. 


Figure 5.21.4. (Labels: u(t), c(t), v(t).) 


6. (Generalization of Exercise 5.) Consider the differential system 

$$\frac{dx}{dt} = Ax + Bu, \tag{5.21.3}$$

where x is a real n-vector, u a real m-vector, A a real n x n matrix and B a 
real n x m matrix. Assume that there is at least one controller u(t) in $L_2[0,1]$ 
such that the corresponding solution of (5.21.3) satisfies 

$$x(0) = x^0 \quad \text{and} \quad x_1(1) = \alpha, \tag{5.21.4}$$

where $x = (x_1,x_2,\ldots,x_n)$. 
(a) Show that the set of all u(t) in $L_2[0,1]$ that satisfy (5.21.4) can be represented 
by $l(u) = \beta$ for some continuous linear functional $l$ and some constant $\beta$. 
(b) Find the input u(t) in $L_2[0,1]$ such that (5.21.4) holds and $\int_0^1 \|u\|^2\,dt$ = 
minimum, where $\|u\|^2 = |u_1|^2 + \cdots + |u_m|^2$. 


7. The Riesz Representation Theorem (Theorem 5.21.1) defines a mapping L of 
H' (the space of all bounded linear functionals on H) onto H. 

(a) Show that this mapping is one-to-one. 

(b) Show that L satisfies 

$$L(f + g) = L(f) + L(g), \quad (f, g \in H')$$
$$L(\alpha f) = \bar{\alpha} L(f), \quad (f \in H', \alpha \in C).$$

(c) Using the definition of norm from Section 8 on H', show that for all f in H' 
one has $\|L(f)\| = \|f\|$. 


(d) Is L an isometric isomorphism ? 


8. (Completion of an inner product space.) Let X be an inner product space. Show 
that the completion of X as a Banach space (compare with Exercise 7, Section 9) 
is a Hilbert space. [Hint: Let $\{x_n\}$ and $\{y_n\}$ be two Cauchy sequences in X. Show 
that 

$$(\{x_n\},\{y_n\}) = \lim_{n\to\infty} (x_n, y_n)$$

defines an inner product on the space of all Cauchy sequences.] 


9. Locate the first point in the proof of the Riesz Representation Theorem where 
the completeness of the Hilbert space H is used. Are there any other places in 
this proof where completeness is used? 


10. Prove the Lax-Milgram Theorem (Theorem 5.21.2). [Hint: For v fixed show that 
there is a y in H with $B[x,v] = (x,y)$ for all $x \in H$. Define A by $y = Av$, and 
show that A is a topological isomorphism mapping H onto itself. If $l(x) = (x,y)$, 
then $l(x) = B[x, A^{-1}y]$. Now apply the same reasoning to $B[u,x]$.] 

11. (a) Show that the Riesz Representation Theorem is valid for real Hilbert 
spaces. 
(b) What happens to Exercise 7 in the case of real Hilbert spaces? 


12. Let H be a Hilbert space of complex-valued functions defined on a set S. One 
says that H is a proper functional Hilbert space if for every $s \in S$ the mapping 
$\phi_s: H \to C$ given by 

$$\phi_s(x) = x(s)$$

is a bounded linear functional on H. H is said to have a reproducing kernel 
K(t,s) if there is a complex-valued function K(t,s) defined on S x S with the 
properties: 
(i) For each t, the function of s, K(s,t) lies in H; and 
(ii) For each $x \in H$ and each $t \in S$ one has $x(t) = (x, K(\cdot,t))$. 
(a) Show that H is a proper functional Hilbert space if and only if H has a 
reproducing kernel. 
(b) Assume that H has a reproducing kernel K(s,t) and let $\{x_n\}$ be an orthonormal 
basis for H. Show that 

$$K(s,t) = \sum_n x_n(s)\overline{x_n(t)}.$$


13. Let H denote all functions f(z) that are analytic for |z| < 1 and such that 

$$\iint_D |f(z)|^2\,dx\,dy < \infty,$$

where D is the unit disk {z: |z| < 1} and z = x + iy. 
(a) Show that H is a Hilbert space when the inner product is given by 

$$(f,g) = \iint_D f(z)\overline{g(z)}\,dx\,dy.$$

(b) Show that H is a proper functional Hilbert space. (The reproducing kernel 
for this Hilbert space is called the Bergman kernel. See Goffman and 
Pedrick [1] and Nehari [1].) 


14. Let A be a subset of a Hilbert space H with the property that $\overline{V(A)}$, the closure 
of the span of A, is H. Let $l$ be a bounded linear functional on H. Show that 
$l$ is uniquely determined if one knows $l(a)$ for all $a \in A$. 


Part C 
Special Operators 


22. THE ADJOINT OPERATOR 


It is appropriate here to state what our major objective will be in the next 
chapter. Suppose that K is a bounded linear operator on a Hilbert space H. For 
example, K might be given by 


$$(Kx)(t) = \int_I k(t,\tau)x(\tau)\,d\tau$$


on $L_2(I)$. We wish to get a better geometric picture of the behavior of K. In many 
important cases it turns out that the restrictions of K to certain linear subspaces are 
particularly simple operators. (For example, the restriction of K to its null space is 
the zero operator.) By piecing together these subspaces and simple operators in a 
suitable manner, we get a global picture of the operator K. However, before we can 
start any serious analysis of linear operators on Hilbert spaces it is absolutely 
necessary that we have the concept of the adjoint operator. 

Let $K: H \to H$ be a bounded linear operator on a Hilbert space H, and let y be 
a fixed element in H. Now consider the form 

$$l(x) = (Kx,y).$$

The mapping $l$ is obviously a linear functional on H. Furthermore, from the Schwarz 
Inequality we have $|l(x)| = |(Kx,y)| \le \|Kx\|\,\|y\|$, and by the definition of the operator 
norm this becomes 

$$|l(x)| \le \|K\|\,\|y\|\,\|x\|,$$

that is, $l$ is bounded. Therefore, by the Riesz Representation Theorem there is a 
unique y* in H such that 

$$(Kx,y) = (x,y^*) \tag{5.22.1}$$

for all $x \in H$. Thus, given a $y \in H$ there is a unique y* associated with it. In other 
words, we have a mapping of H into itself. 


5.22.1 DEFINITION. Let $K^*: H \to H$ be the mapping defined by (5.22.1) so 
that y* = K*y. K* is referred to as the adjoint of K. 
It follows from this definition that 

$$(Kx,y) = (x,K^*y) \tag{5.22.2}$$

for all $x, y \in H$. Moreover, K* is the only mapping satisfying (5.22.2). Indeed, if 
$K_1$ and $K_2$ are two transformations satisfying (5.22.2), then 

$$(x,K_1 y) = (x,K_2 y)$$

for all $x, y \in H$. It follows that $K_1 = K_2$. (Why?) 




We leave it to the reader to show the following: 

$$I^* = I,$$
$$0^* = 0,$$
$$(S + T)^* = S^* + T^*,$$
$$(\alpha T)^* = \bar{\alpha} T^*,$$
$$(ST)^* = T^* S^*.$$


In addition, we have the following facts about the adjoint. 


5.22.2 THEOREM. Let $K: H \to H$ be a bounded linear operator on a Hilbert 
space H. Then the adjoint operator K* is a bounded linear operator and $\|K\| = \|K^*\|$. 
Moreover, (K*)* = K. 


First Part of Proof: First we note that 

$$(x,K^*(\alpha_1 y_1 + \alpha_2 y_2)) = (Kx,\alpha_1 y_1 + \alpha_2 y_2)$$
$$= \bar{\alpha}_1(Kx,y_1) + \bar{\alpha}_2(Kx,y_2)$$
$$= \bar{\alpha}_1(x,K^*y_1) + \bar{\alpha}_2(x,K^*y_2)$$
$$= (x,\alpha_1 K^*y_1 + \alpha_2 K^*y_2),$$

which shows that K* is linear. 
We will now show that $\|K^*\| \le \|K\|$. Note that 

$$\|K^*y\|^2 = (K^*y,K^*y) = (KK^*y,y) \le \|KK^*y\|\cdot\|y\| \le \|K\|\cdot\|K^*y\|\cdot\|y\|.$$

Hence $\|K^*y\| \le \|K\|\cdot\|y\|$, which implies that 

$$\|K^*\| \le \|K\|.$$


If we replace K by K* in Equation (5.22.2) this allows us to define the second 
adjoint K**. That is, 

$$(K^*x,y) = (x,K^{**}y).$$

Since 

$$(x,Ky) = \overline{(Ky,x)} = \overline{(y,K^*x)} = (K^*x,y)$$

for all x and y, it follows that 

$$(x,Ky) = (x,K^{**}y)$$

for all x and y. But this implies that K** = K. 


We can now complete the proof of Theorem 5.22.2. 


Second Part of Proof: We have shown that $\|K^*\| \le \|K\|$. Now if we replace K 
by K* we get $\|K\| = \|K^{**}\| \le \|K^*\|$. Hence $\|K\| = \|K^*\|$. ∎ 


5.22.3 COROLLARY. $\|K^*K\| = \|KK^*\| = \|K\|^2 = \|K^*\|^2$. 


Proof: It follows from Theorem 5.8.4 and the foregoing theorem that 

$$\|K^*K\| \le \|K^*\|\,\|K\| = \|K\|\,\|K\| = \|K\|^2 = \|K^*\|^2. \tag{5.22.3}$$

Furthermore, 

$$\|Kx\|^2 = (Kx,Kx) = (K^*Kx,x) \le \|K^*Kx\|\,\|x\| \le \|K^*K\|\,\|x\|^2,$$

so we have 

$$\|K\|^2 \le \|K^*K\|. \tag{5.22.4}$$

Combining (5.22.3) and (5.22.4) yields the equality $\|K^*K\| = \|K\|^2$. The remainder 
of the corollary is proved in an obvious way. ∎ 


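EXAMPLE 1. Let $H = C^n$ with the inner product given by 

$$(x,y) = \sum_{i=1}^{n} x_i \bar{y}_i.$$

Let $K: C^n \to C^n$ be a linear operator that is represented by an n x n matrix of complex 
coefficients $(k_{ij})$. That is, if $y = (y_1,\ldots,y_n)$ and $x = (x_1,\ldots,x_n)$, then 

$$y = Kx \iff y_i = \sum_{j=1}^{n} k_{ij} x_j, \quad i = 1,\ldots,n.$$

What is the adjoint K*? Assume that K* is represented by the n x n matrix with 
coefficients $(l_{ij})$. The equation (Kx,y) = (x,K*y) then becomes 

$$\sum_{i=1}^{n}\Big(\sum_{j=1}^{n} k_{ij}x_j\Big)\bar{y}_i = \sum_{i=1}^{n} x_i\,\overline{\Big(\sum_{j=1}^{n} l_{ij}y_j\Big)}.$$

Since 

$$\sum_{i}\Big(\sum_{j} k_{ij}x_j\Big)\bar{y}_i = \sum_{j} x_j \sum_{i} k_{ij}\bar{y}_i = \sum_{j} x_j\,\overline{\Big(\sum_{i} \bar{k}_{ij} y_i\Big)} = \sum_{i} x_i\,\overline{\Big(\sum_{j} \bar{k}_{ji} y_j\Big)}$$

(where in the last step we merely change the summation indices), it follows that 
$l_{ij} = \bar{k}_{ji}$. That is, the matrix for K* is found by (a) taking the transpose of $(k_{ij})$ and 
(b) taking the complex conjugate of each entry of $(k_{ij})$. ∎ 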
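
The conclusion of Example 1 can be checked numerically. The sketch below builds the conjugate transpose of a small complex matrix and compares $(Kx,y)$ with $(x,K^*y)$; the matrix and vectors are arbitrary illustrative data, and the helper names are hypothetical.

```python
def inner(x, y):
    """Inner product on C^n: sum of x_i times the conjugate of y_i."""
    return sum(a * b.conjugate() for a, b in zip(x, y))

def apply(K, x):
    """Matrix-vector product y_i = sum_j k_ij x_j."""
    return [sum(k * xj for k, xj in zip(row, x)) for row in K]

def adjoint(K):
    """Conjugate transpose: entry (i, j) is the conjugate of k_ji."""
    rows, cols = len(K), len(K[0])
    return [[K[i][j].conjugate() for i in range(rows)] for j in range(cols)]

K = [[1 + 2j, 3j], [4 + 0j, 5 - 1j]]
x = [1 - 1j, 2 + 0j]
y = [0.5j, -1 + 3j]

lhs = inner(apply(K, x), y)           # (Kx, y)
rhs = inner(x, apply(adjoint(K), y))  # (x, K*y)
```

The two numbers `lhs` and `rhs` agree up to rounding, and applying `adjoint` twice returns the original matrix, in line with (K*)* = K.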


EXAMPLE 2. Let $H = R^n$ with the inner product given by $(x,y) = \sum_{i=1}^{n} x_i y_i$. 
The adjoint of a matrix operator is the transpose. ∎ 


EXAMPLE 3. Let I be an interval and let $k: I \times I \to C$ be such that 

$$\int_I \int_I |k(s,t)|^2\,ds\,dt < \infty.$$

Define $K: L_2(I) \to L_2(I)$ by z = Kx where 

$$z(s) = \int_I k(s,t)x(t)\,dt,$$

and assume that $L_2(I)$ has the usual inner product 

$$(x,y) = \int_I x\bar{y}\,dt.$$

We will show that K* is also an integral operator. In this case we get 

$$(Kx,y) = \int_I \Big(\int_I k(s,t)x(t)\,dt\Big)\overline{y(s)}\,ds = \int_I \int_I k(s,t)x(t)\overline{y(s)}\,ds\,dt$$

$$= \int_I x(t)\,\overline{\Big(\int_I \overline{k(s,t)}\,y(s)\,ds\Big)}\,dt = (x,K^*y).$$

Hence, after interchanging the s and t variables, we get 

$$(K^*y)(s) = \int_I \overline{k(t,s)}\,y(t)\,dt = \int_I k^*(s,t)y(t)\,dt,$$

that is, $k^*(s,t) = \overline{k(t,s)}$. ∎ 


EXAMPLE 4. The foregoing example is particularly interesting when K represents 
a Volterra integral operator 

$$y(t) = \int_0^t k(t,\tau)x(\tau)\,d\tau. \tag{5.22.5}$$

Here we take I = [0,T]. If we set $k(t,\tau) = 0$ for $t < \tau$, then (5.22.5) becomes 

$$y(t) = \int_0^T k(t,\tau)x(\tau)\,d\tau.$$

According to the last example the adjoint is given by 

$$(K^*y)(t) = \int_0^T \overline{k(\tau,t)}\,y(\tau)\,d\tau = \int_t^T \overline{k(\tau,t)}\,y(\tau)\,d\tau. \tag{5.22.6}$$

In other words, the adjoint of a Volterra integral operator is also a Volterra integral 
operator. However, if K depends on the "past," then K* depends on the "future." 

The Volterra integral operator in (5.22.5) is a particular example of a causal 
operator, and (5.22.6) is an example of an anticausal operator (see Example 7). ∎ 


EXAMPLE 5. On $L_2(-\infty,\infty)$ consider the multiplication operator 

$$F: x(t) \to f(t)x(t),$$

where $|f(t)| \le B < \infty$ for almost all t. F is a bounded linear operator and it is easy 
to show that 

$$\|F\| = \operatorname*{ess\,sup}_{t} |f(t)| = \|f\|_\infty.$$

Furthermore, the adjoint mapping is given by 

$$F^*: y(t) \to \overline{f(t)}\,y(t).$$

Indeed 

$$(Fx,y) = \int_{-\infty}^{\infty} f(t)x(t)\overline{y(t)}\,dt = \int_{-\infty}^{\infty} x(t)\,\overline{\overline{f(t)}\,y(t)}\,dt = (x,F^*y). \;∎$$


If $T: X \to X$ is a linear transformation and M is a linear subspace of X such that 
$T(M) \subset M$, we say that M is invariant under T. In this case the restriction of T to M 
is a mapping of M into itself. 


EXAMPLE 6. Consider the space $L_2(-\infty,\infty)$ and let I be an interval in R. Let 

$$M = \{x \in L_2(-\infty,\infty) : x(t) = 0 \text{ for } t \notin I\}.$$

If F is a multiplication operator 

$$F: x(t) \to f(t)x(t),$$

where $\|f\|_\infty < \infty$, then $F(M) \subset M$. ∎ 


EXAMPLE 7. In this example we show that if L is causal, then L* is anticausal. 
Let $H = L_2(-\infty,\infty)$ and let L be a bounded linear mapping of H into itself. We 
know from Example 4, Section 4.3, that L is causal if and only if each linear subspace 
$H_T$ is invariant under L, where $H_T$ is the linear subspace of H made up of all 
functions x such that x(t) = 0 (almost everywhere) for t < T. Let $P_T: H \to H$ be 
defined by 

$$(P_T x)(t) = \begin{cases} x(t), & \text{for } T \le t < \infty \\ 0, & \text{for } -\infty < t < T. \end{cases}$$

Obviously $P_T$ is an orthogonal projection with $\mathcal{R}(P_T) = H_T$. Furthermore we see 
that $P_T^* = P_T$. 

We claim that $H_T$ is invariant under L if and only if $LP_T = P_T L P_T$. To prove 
this first assume that $L(H_T) \subset H_T$. Then $LP_T x$ is a point in $H_T$ for any $x \in H$. But 
the restriction of a projection to its range is the identity operator on the range, so 
$LP_T x = P_T L P_T x$ for all $x \in H$. 

Next we assume that $LP_T = P_T L P_T$. Since $\mathcal{R}(P_T) = H_T$, the linear subspace 
$H_T$ is exactly the set of all points of the form $P_T x$ with $x \in H$. Thus the set of all 
points of the form $LP_T x$ with $x \in H$ is $L(H_T)$. Since $LP_T = P_T L P_T$, it follows that 
$y = P_T y$ for all $y \in L(H_T)$. In other words, $L(H_T) \subset \mathcal{R}(P_T) = H_T$. 

If we let $Q_T = I - P_T$, then it can be shown in a manner similar to the above 
argument that L* is anticausal if and only if $L^*Q_T = Q_T L^* Q_T$ for each T. 

Assuming, then, that $LP_T = P_T L P_T$ for each T, we have 

$$(y,LP_T x) = (y,P_T L P_T x), \quad \text{for all } x, y \in H.$$

So 

$$(P_T L^* y,x) = (P_T L^* P_T y,x).$$

Then using $P_T = I - Q_T$, we eventually obtain 

$$0 = (Q_T L^* Q_T y,x) - (L^* Q_T y,x)$$

for all $x, y \in H$. Hence, $Q_T L^* Q_T = L^* Q_T$, showing that L* is anticausal. A simple 
reversal of the above steps shows that if L* is anticausal then L is causal. ∎ 


Suppose T is a linear transformation of a Hilbert space H into itself. Sometimes 
it happens that a closed linear subspace M and its orthogonal complement $M^\perp$ are both 
invariant under T, that is, $T(M) \subset M$ and $T(M^\perp) \subset M^\perp$. When this happens we 
say that M reduces T. We say that M "reduces" T because then T is completely 
characterized by its restrictions to M and $M^\perp$. It often happens that these restrictions 
of T are simpler than T itself. 

Going further, we sometimes can find a family of closed mutually orthogonal 
subspaces $\{M_n\}$ such that $H = M_1 + M_2 + M_3 + \cdots$, in the sense of Section 20, 
such that each $M_n$ reduces T, and such that the restriction of T to each $M_n$ is a 
simple operator. This idea is developed further in the next chapter. 


5.22.4 THEOREM. Let T be a continuous linear transformation of a Hilbert space 
H into itself. A closed linear subspace M of H is invariant under T if and only if $M^\perp$ 
is invariant under T*. 


Proof: Suppose M is invariant under T, that is, $T(M) \subset M$. Then (y,Tx) = 0 
for all $y \in M^\perp$ and $x \in M$. Thus (T*y,x) = 0 for all $y \in M^\perp$ and $x \in M$. Hence 
$T^*y \in M^\perp$ for all $y \in M^\perp$, which shows that $M^\perp$ is invariant under T*. Also if $M^\perp$ 
is invariant under T*, essentially the same argument shows that M is invariant 
under T. ∎ 


5.22.5 COROLLARY. A closed linear subspace M of H reduces T if and only if 
M is invariant under both T and T*. 


Proof: One merely recalls [Theorem 5.15.4(b)] that $M^{\perp\perp} = M$ and T** = T. 
The proof is then trivial. ∎ 


The preceding two results (Theorem 5.22.4 and Corollary 5.22.5) are our first 
use of the adjoint. They are by no means our last. The next theorem shows that 
T* can be used to characterize the range and null space of T and vice versa. 


5.22.6 THEOREM. Let T be a bounded linear transformation of a Hilbert space 
H into itself. Then 

$$\overline{\mathcal{R}(T)} = \{\mathcal{N}(T^*)\}^\perp \tag{5.22.7}$$

and 

$$\{\mathcal{N}(T)\}^\perp = \overline{\mathcal{R}(T^*)},$$

where $\mathcal{N}(T)$ and $\mathcal{N}(T^*)$ are the null spaces of T and T*, respectively; and $\overline{\mathcal{R}(T)}$ and 
$\overline{\mathcal{R}(T^*)}$ are the closures of the ranges of T and T*, respectively. 


Proof: Since T** = T it will suffice to prove (5.22.7). A point $z \in H$ is in 
$\mathcal{R}(T)^\perp$ if and only if (z,Tx) = 0 for all $x \in H$. But by definition of the adjoint, it 
follows that 

$$(T^*z,x) = 0$$

for all $x \in H$. So $z \in \{\mathcal{R}(T)\}^\perp$ if and only if T*z = 0. Hence, we have shown that 
$\{\mathcal{R}(T)\}^\perp = \mathcal{N}(T^*)$. 

However, now $\mathcal{R}(T)$ may not be closed, so it is not necessarily the case that 
$\{\mathcal{R}(T)\}^{\perp\perp} = \mathcal{R}(T)$; but, we do always have that $\{\mathcal{R}(T)\}^{\perp\perp} = \overline{\mathcal{R}(T)}$, which proves 
(5.22.7). ∎ 


The concept of the adjoint mapping offers an elegant way of characterizing 
unitary operators. 


5.22.7 THEOREM. Let $U: H \to H$ be a bounded linear operator on a Hilbert 
space H. Then U is a unitary operator if and only if UU* = U*U = I, that is, if and 
only if $U^* = U^{-1}$. 


Proof: If U is a unitary mapping, then 

$$(U^*Ux,y) = (Ux,Uy) = (x,y),$$

for all x and y in H. Hence U*U = I. Similarly, we see that UU* = I. 
Now assume that U*U = I. Then 

$$(x,y) = (U^*Ux,y) = (Ux,Uy),$$

for all x and y in H, so U is unitary. ∎ 


EXAMPLE 8. Let $H = L_2(-i\infty,i\infty)$, and let T denote the linear transformation 
of H into itself defined by Y = TX where 

$$Y(i\omega) = \frac{1}{1 + i\omega}\,X(i\omega), \quad i\omega \in (-i\infty,i\infty).$$

Then the adjoint of T can be represented by the equation 

$$Z(i\omega) = \frac{1}{1 - i\omega}\,W(i\omega), \quad i\omega \in (-i\infty,i\infty).$$

Since the (transfer) function is nonzero for all $i\omega$, the null space of T*, $\mathcal{N}(T^*)$, is 
the trivial subspace {0}. Hence, $\{\mathcal{N}(T^*)\}^\perp = H$, and from Theorem 5.22.6 it 
follows that $\overline{\mathcal{R}(T)} = H$. Of course, in this simple case we can see directly that this is 
so. Indeed, 

$$\mathcal{R}(T) = \Big\{Y \in H : Y(i\omega) = \frac{1}{1 + i\omega}\,X(i\omega) \text{ with } X \in H\Big\},$$

which is a subspace that is dense in H. (Why?) On the other hand, one does have 
$\mathcal{R}(T) \ne H$. ∎ 


EXAMPLE 9. Theorem 5.22.6 can be used in approximation. Recall that Theorem 
5.14.4 is the key result concerning approximation in Hilbert spaces. Now it 
often happens that the closed linear subspace M containing the approximation $y_0$ 
is specified as the null space or range of a bounded linear transformation. Let us 
briefly consider these two cases of this approximation problem. 


Case I: $M = \mathcal{N}(T)$. 

Suppose that T is a bounded linear transformation of a Hilbert space H into 
itself. Then $M = \mathcal{N}(T)$ is a closed linear subspace of H. If $x_0 \in H$, then (by Theorem 
5.14.4) there is a unique $y_0 \in M$ such that $\|x_0 - y_0\| \le \|x_0 - y\|$ for all $y \in M$ and 
$x_0 - y_0 \in M^\perp$. It follows from Theorem 5.22.6 that $(x_0 - y_0)$ is in $\overline{\mathcal{R}(T^*)}$. Of course, 
if $\mathcal{R}(T^*)$ is closed, then there exists a $z_0 \in H$ such that $x_0 - y_0 = T^*z_0$. However, 
we cannot assume in general that $\mathcal{R}(T^*)$ is closed. Hence, all we can say for sure is 
that there exists a sequence $\{z_n\}$ in H such that 

$$x_0 - y_0 = \lim_{n\to\infty} T^*z_n.$$

In order to simplify the discussion, let us assume that there is a $z_0 \in H$ such that 
$x_0 - y_0 = T^*z_0$. The point $y_0$ is in the subspace $M = \mathcal{N}(T)$ if and only if 

$$Tx_0 = TT^*z_0. \tag{5.22.8}$$

So our problem reduces to finding a $z_0$ that satisfies (5.22.8). Then given such a 
$z_0$ we can use $y_0 = x_0 - T^*z_0$ to get $y_0$. 


Case II: $M = \overline{\mathcal{R}(T)}$. 
Here again it may be that $\mathcal{R}(T)$ is not closed. If it is not, we can say that there 
exists a sequence $\{z_n\}$ in H such that 

$$\lim_{n\to\infty} \|x_0 - Tz_n\| = \operatorname{dist}(x_0,M).$$

Again to simplify matters let us assume that there is a $z_0 \in H$ such that 

$$\|x_0 - Tz_0\| \le \|x_0 - Tz\|$$

for all $z \in H$. Then $(x_0 - Tz_0) \perp M$, so 

$$0 = (x_0 - Tz_0, Tz) = (T^*x_0 - T^*Tz_0, z),$$

for all $z \in H$. It follows that 

$$T^*x_0 = T^*Tz_0. \tag{5.22.9}$$

Given, then, a $z_0$ that satisfies (5.22.9) the approximation $y_0$ to $x_0$ is given by 
$y_0 = Tz_0$. ∎ 
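
Case II is the familiar least-squares problem, and (5.22.9) is its normal equation. The sketch below is a finite-dimensional illustration under assumptions not made in the text: T is taken to be a real 3 x 2 matrix mapping $R^2$ into $R^3$ (so domain and range spaces differ, and the adjoint is the transpose), and its columns are linearly independent so that T*T is invertible. All data and helper names are hypothetical.

```python
def transpose(A):
    return [list(col) for col in zip(*A)]

def matvec(A, x):
    return [sum(a * b for a, b in zip(row, x)) for row in A]

def matmul(A, B):
    Bt = transpose(B)
    return [[sum(a * b for a, b in zip(row, col)) for col in Bt] for row in A]

T = [[1.0, 0.0], [1.0, 1.0], [1.0, 2.0]]   # hypothetical operator
x0 = [1.0, 2.0, 2.0]                        # point to approximate

Tt = transpose(T)     # real case: the adjoint T* is the transpose
N = matmul(Tt, T)     # T*T, a 2 x 2 matrix
r = matvec(Tt, x0)    # T*x0

# Solve the 2 x 2 normal equation T*T z0 = T*x0 by Cramer's rule.
det = N[0][0] * N[1][1] - N[0][1] * N[1][0]
z0 = [(N[1][1] * r[0] - N[0][1] * r[1]) / det,
      (N[0][0] * r[1] - N[1][0] * r[0]) / det]

y0 = matvec(T, z0)                          # best approximation in R(T)
residual = [a - b for a, b in zip(x0, y0)]  # x0 - T z0
```

Here `z0` comes out as (7/6, 1/2), and the residual is orthogonal to the range of T, which is the condition $(x_0 - Tz_0) \perp M$ used in the derivation.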
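EXAMPLE 10. Let $H = L_2(-\infty,\infty)$, and consider the delay or shift $S_\tau: H \to H$, 
where $(S_\tau x)(t) = x(t - \tau)$ for $t \in (-\infty,\infty)$. It is clear that $S_\tau$ has a continuous inverse; 
indeed, $S_\tau^{-1} = S_{-\tau}$. In fact $S_\tau$ is a unitary operator, for 

$$(S_\tau x,S_\tau y) = \int_{-\infty}^{\infty} \overline{y(t - \tau)}\,x(t - \tau)\,dt = \int_{-\infty}^{\infty} \overline{y(t)}\,x(t)\,dt = (x,y),$$

for all $x, y \in H$. Moreover, 

$$(S_\tau x,y) = \int_{-\infty}^{\infty} \overline{y(t)}\,x(t - \tau)\,dt = \int_{-\infty}^{\infty} \overline{y(t + \tau)}\,x(t)\,dt,$$

for all $x, y \in H$, so the adjoint of $S_\tau$ is defined by 

$$(S_\tau^* y)(t) = y(t + \tau), \quad \text{for } t \in (-\infty,\infty).$$

Hence, 

$$S_\tau^* = S_{-\tau} = S_\tau^{-1}. \tag{5.22.10}$$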
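
A finite, periodic stand-in makes (5.22.10) easy to check by hand. The sketch below replaces $L_2(-\infty,\infty)$ by sequences of length n with a cyclic delay, which is an assumption made purely for illustration; the sequences and the delay `k` are arbitrary.

```python
def inner(x, y):
    """Inner product on C^n: sum of x_t times the conjugate of y_t."""
    return sum(a * b.conjugate() for a, b in zip(x, y))

def shift(x, k):
    """Cyclic delay by k samples: (S_k x)[t] = x[(t - k) mod n]."""
    n = len(x)
    return [x[(t - k) % n] for t in range(n)]

x = [1 + 1j, 2 + 0j, -1j, 0.5 + 0j, 3 - 2j]
y = [0j, 1 - 1j, 2 + 2j, -1 + 0j, 1j]
k = 2

unitary_gap = abs(inner(shift(x, k), shift(y, k)) - inner(x, y))
adjoint_gap = abs(inner(shift(x, k), y) - inner(x, shift(y, -k)))
round_trip = shift(shift(x, k), -k)
```

Both gaps vanish up to rounding and the round trip returns `x`, mirroring the three statements that $S_\tau$ is unitary, that its adjoint is $S_{-\tau}$, and that its inverse is $S_{-\tau}$.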


EXAMPLE 11. In this example we show that the adjoint of a linear time-invariant 
operator is also time-invariant. Let L be a time-invariant continuous linear 
transformation of $L_2(-\infty,\infty)$ into itself. Recall that L being time-invariant means 

$$S_\tau L = LS_\tau, \quad \text{for all } -\infty < \tau < \infty,$$

where $S_\tau$ denotes the delay or shift $(S_\tau x)(t) = x(t - \tau)$. Then, using (5.22.10), we get 

$$(x,L^*S_\tau y) = (Lx,S_\tau y) = (S_{-\tau}Lx,y)$$

and 

$$(x,S_\tau L^*y) = (S_{-\tau}x,L^*y) = (LS_{-\tau}x,y) = (S_{-\tau}Lx,y)$$

for all $x, y \in L_2(-\infty,\infty)$. Note that the last interchange is a consequence of L being 
time-invariant. In any event, we have $(x,L^*S_\tau y) = (x,S_\tau L^*y)$ for all x, y in 
$L_2(-\infty,\infty)$, so $L^*S_\tau = S_\tau L^*$; therefore, L* is time-invariant. ∎ 


EXAMPLE 12. (FOURIER TRANSFORM ON $R^1$.) In this example we shall prove 
that the Fourier transform F, as defined by (compare with Example 6, Section 19) 

$$\hat{f}(y) = (Ff)(y) = \frac{1}{\sqrt{2\pi}}\int_{-\infty}^{\infty} e^{-ixy}f(x)\,dx, \tag{5.22.11}$$

is a unitary mapping of $L_2(-\infty,\infty)$ onto itself and that the inverse is given by 

$$f(x) = (F^{-1}\hat{f})(x) = \frac{1}{\sqrt{2\pi}}\int_{-\infty}^{\infty} e^{ixy}\hat{f}(y)\,dy. \tag{5.22.12}$$

Actually the representation for F and $F^{-1}$ given by (5.22.11) and (5.22.12) is valid 
for functions f and $\hat{f}$ in $L_1(-\infty,\infty) \cap L_2(-\infty,\infty)$. In order to discuss arbitrary 
functions in $L_2(-\infty,\infty)$ we need a different representation, namely, 

$$\hat{f}(y) = (Ff)(y) = \frac{1}{\sqrt{2\pi}}\frac{d}{dy}\int_{-\infty}^{\infty} \frac{e^{-ixy} - 1}{-ix}\,f(x)\,dx \tag{5.22.13}$$

and 

$$f(x) = (F^{-1}\hat{f})(x) = \frac{1}{\sqrt{2\pi}}\frac{d}{dx}\int_{-\infty}^{\infty} \frac{e^{ixy} - 1}{iy}\,\hat{f}(y)\,dy. \tag{5.22.14}$$

If f and $\hat{f}$ belong to $L_1(-\infty,\infty)$ then we can bring the differentiation inside the 
integral and then (5.22.13) and (5.22.14) reduce to (5.22.11) and (5.22.12). The 
transformation F represented by (5.22.13) is oftentimes referred to as the Fourier-Plancherel 
transform. 

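Let us now show that F and $F^{-1}$ are unitary operators on $L_2(-\infty,\infty)$. For 
this purpose we shall denote the operator defined by (5.22.14) as G. We will then 
show that F and G are unitary and that F* = G, which implies that $F^{-1} = G$. 

Now define 

$$H(x,y) = \frac{1}{\sqrt{2\pi}}\,\frac{e^{-ixy} - 1}{-iy}, \qquad K(y,x) = \frac{1}{\sqrt{2\pi}}\,\frac{e^{ixy} - 1}{ix},$$

and let 

$$\phi_r(x) = \begin{cases} +1, & 0 \le x \le r, \; 0 \le r, \\ -1, & r \le x \le 0, \; r < 0, \\ 0, & \text{otherwise.} \end{cases}$$

Now for r > 0 one has 

$$(F\phi_r)(y) = \frac{d}{dy}\,\frac{1}{\sqrt{2\pi}}\int_0^r \frac{e^{-ixy} - 1}{-ix}\,dx = \frac{1}{\sqrt{2\pi}}\int_0^r e^{-ixy}\,dx = H(r,y).$$

Similarly for r < 0 one has $(F\phi_r)(y) = H(r,y)$. Likewise we get $(G\phi_r)(x) = K(r,x)$. 
Since $\Phi(y) = \operatorname{Im} H(r,y)\overline{H(s,y)}$ is an odd function in $L_1(-\infty,\infty)$, one has 
$\int_{-\infty}^{\infty} \Phi(y)\,dy = 0$. Hence 

$$(F\phi_r,F\phi_s) = \int_{-\infty}^{\infty} H(r,y)\overline{H(s,y)}\,dy = \frac{1}{2\pi}\int_{-\infty}^{\infty} \frac{\cos(s - r)y - \cos sy - \cos ry + 1}{y^2}\,dy.$$

By using the trigonometric identity $\cos\theta = 1 - 2\sin^2(\theta/2)$ and by changing variables 
we get 

$$(F\phi_r,F\phi_s) = \frac{1}{\pi}\{|r| + |s| - |r - s|\}\int_0^{\infty} \frac{\sin^2 u}{u^2}\,du.$$

Since $\int_0^{\infty} \frac{\sin^2 u}{u^2}\,du = \frac{\pi}{2}$, we get 

$$(F\phi_r,F\phi_s) = \tfrac{1}{2}\{|r| + |s| - |r - s|\} = (\phi_r,\phi_s).$$

Similarly one has 

$$(G\phi_r,G\phi_s) = (\phi_r,\phi_s).$$

Furthermore by a simple change of variables we get 

$$(F\phi_r,\phi_s) = (\phi_r,G\phi_s).$$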

If f and g are now finite linear combinations of the functions $\phi_r$, that is, if they are 
step-functions, then one has 

$$(Ff,Fg) = (f,g),$$
$$(Gf,Gg) = (f,g), \tag{5.22.15}$$
$$(Ff,g) = (f,Gg).$$

However the step functions are dense in $L_2(-\infty,\infty)$, therefore (5.22.15) is valid for 
all f and g in $L_2(-\infty,\infty)$. This shows that F and G are unitary and that F* = G. 
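
The three identities in (5.22.15) have an exact finite-dimensional counterpart. The sketch below uses the discrete Fourier transform on $C^n$ with the scaling $1/\sqrt{n}$, which is a stand-in chosen for illustration and not the transform on $L_2(-\infty,\infty)$ treated in the text. Here `dft(x)` plays the role of F and `dft(x, sign=+1)` the role of G; the vectors are arbitrary.

```python
import cmath
import math

def inner(x, y):
    """Inner product on C^n: sum of x_k times the conjugate of y_k."""
    return sum(a * b.conjugate() for a, b in zip(x, y))

def dft(x, sign=-1):
    """Discrete Fourier transform scaled by 1/sqrt(n).

    sign=-1 uses the kernel exp(-2*pi*i*j*k/n); sign=+1 uses its conjugate.
    """
    n = len(x)
    w = sign * 2j * math.pi / n
    return [sum(x[k] * cmath.exp(w * j * k) for k in range(n)) / math.sqrt(n)
            for j in range(n)]

x = [1 + 0j, 2 - 1j, 0.5j, -3 + 0j]
y = [0j, 1 + 1j, 2 + 0j, -1j]

isometry_gap = abs(inner(dft(x), dft(y)) - inner(x, y))          # (Fx,Fy) = (x,y)
adjoint_gap = abs(inner(dft(x), y) - inner(x, dft(y, sign=+1)))  # (Fx,y) = (x,Gy)
round_trip = dft(dft(x), sign=+1)                                # GFx = x
```

With this scaling the forward and inverse kernels are complex conjugates of each other, which is the discrete form of the statement that F* = G.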
Next, one can prove that the Fourier transform F is given by (5.22.11) for 
all $f \in L_2(-\infty,\infty)$ if one interprets the integral in (5.22.11) in the following sense: 

$$\hat{f}(y) = \operatorname*{l.i.m.}_{N\to\infty}\,\frac{1}{\sqrt{2\pi}}\int_{-N}^{N} e^{-iyx}f(x)\,dx,$$

where l.i.m. means "limit in the mean," that is, 

$$\int_{-\infty}^{\infty}\Big|\hat{f}(y) - \frac{1}{\sqrt{2\pi}}\int_{-N}^{N} e^{-iyx}f(x)\,dx\Big|^2 dy \to 0$$

as $N \to \infty$. You are asked to prove this in Exercise 13 below. 


The reader is probably aware of the importance of the Fourier transform. It is 
a fundamental tool in the operational calculus of differential operators. The 
following theorem is the cornerstone of this theory. 


5.22.8 THEOREM. Let P and Q be the linear operators defined by 

$$P: u(x) \to i\,\frac{du}{dx}$$
$$Q: u(x) \to xu(x),$$

where the domains are 

$$D_P = \{u \in L_2(-\infty,\infty) : u \text{ is absolutely continuous and } u' \in L_2(-\infty,\infty)\}$$
$$D_Q = \{u \in L_2(-\infty,\infty) : xu(x) \in L_2(-\infty,\infty)\}.$$

Then the Fourier transform F sets up a one-to-one correspondence between $D_P$ and $D_Q$ 
in such a way that 

$$P = FQF^{-1} \quad \text{and} \quad Q = F^{-1}PF.$$


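Proof: The first step is to show that if $u \in D_Q$, then $Fu \in D_P$ and PFu = FQu. 
Let $u \in D_Q$, then $\int_{-\infty}^{\infty} |u(x)|\,dx < \infty$, by Exercise 14 below. Thus v = Fu is given by 

$$v(y) = (Fu)(y) = \frac{1}{\sqrt{2\pi}}\int_{-\infty}^{\infty} e^{-iyx}u(x)\,dx.$$

However, $FQu \in L_2(-\infty,\infty)$ and 

$$(FQu)(y) = \frac{d}{dy}\,\frac{1}{\sqrt{2\pi}}\int_{-\infty}^{\infty} \frac{e^{-iyx} - 1}{-ix}\,xu(x)\,dx$$

$$= i\,\frac{d}{dy}\,\frac{1}{\sqrt{2\pi}}\int_{-\infty}^{\infty} (e^{-iyx} - 1)u(x)\,dx$$

$$= i\,\frac{d}{dy}\,\frac{1}{\sqrt{2\pi}}\int_{-\infty}^{\infty} e^{-iyx}u(x)\,dx$$

$$= (PFu)(y).$$

Hence $Fu \in D_P$ and FQu = PFu. 
The second step is to show that if $v \in D_P$, then $F^{-1}v \in D_Q$ and $F^{-1}Pv = QF^{-1}v$. 
Let $v \in D_P$, then 

$$\lim_{x\to\pm\infty} v(x) = 0$$

by Exercise 15 below. Furthermore $F^{-1}Pv$ is in $L_2(-\infty,\infty)$ and if we integrate by 
parts we get 

$$(F^{-1}Pv)(x) = \frac{d}{dx}\,\frac{1}{\sqrt{2\pi}}\int_{-\infty}^{\infty} \frac{e^{ixy} - 1}{iy}\,i\,\frac{dv(y)}{dy}\,dy$$

$$= -\frac{d}{dx}\,\frac{1}{\sqrt{2\pi}}\int_{-\infty}^{\infty} \frac{ixy\,e^{ixy} - (e^{ixy} - 1)}{y^2}\,v(y)\,dy$$

$$= \frac{d}{dx}\,\frac{x}{\sqrt{2\pi}}\int_{-\infty}^{\infty} \frac{e^{ixy} - 1}{iy}\,v(y)\,dy + \frac{d}{dx}\,\frac{1}{\sqrt{2\pi}}\int_{-\infty}^{\infty} \frac{e^{ixy} - 1 - ixy}{y^2}\,v(y)\,dy.$$

Since 

$$\frac{d}{dx}\,\frac{1}{\sqrt{2\pi}}\int_{-\infty}^{\infty} \frac{e^{ixy} - 1 - ixy}{y^2}\,v(y)\,dy = -\frac{1}{\sqrt{2\pi}}\int_{-\infty}^{\infty} \frac{e^{ixy} - 1}{iy}\,v(y)\,dy,$$

we get 

$$(F^{-1}Pv)(x) = x\,\frac{d}{dx}\,\frac{1}{\sqrt{2\pi}}\int_{-\infty}^{\infty} \frac{e^{ixy} - 1}{iy}\,v(y)\,dy = x(F^{-1}v)(x).$$

Hence $F^{-1}v \in D_Q$ and $F^{-1}Pv = QF^{-1}v$. ∎ 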


EXAMPLE 13. (FOURIER TRANSFORM ON $R^n$.) On $R^n$ the Fourier transform 
takes on the form (compare with Example 6, Section 19) 

$$\hat{f}(y) = (\mathcal{F}f)(y) = \int_{R^n} e^{-ix\cdot y}f(x)\,dx \tag{5.22.16}$$

$$f(x) = (\mathcal{F}^{-1}\hat{f})(x) = \frac{1}{(2\pi)^n}\int_{R^n} e^{ix\cdot y}\hat{f}(y)\,dy, \tag{5.22.17}$$

where $x = (x_1,\ldots,x_n)$ and $y = (y_1,\ldots,y_n)$ are points in $R^n$ and 

$$x \cdot y = x_1 y_1 + \cdots + x_n y_n.$$

Equations (5.22.16) and (5.22.17) are valid for f and $\hat{f}$ in $L_1(R^n) \cap L_2(R^n)$, and they 
are also valid for all f and $\hat{f}$ in $L_2(R^n)$ provided that we compute the integral as a 
limit in the mean of integrals over bounded regions. 

One can prove that 

$$(\mathcal{F}f,\mathcal{F}g) = (2\pi)^n (f,g)$$

for all f, g in $L_2(R^n)$. This is done by applying the Fourier transform on $R^1$ to each 
of the variables $x_1, x_2, \ldots, x_n$ successively. Similarly one gets 

$$(\mathcal{F}^{-1}f,\mathcal{F}^{-1}g) = (2\pi)^{-n}(f,g).$$


By the same method one can prove the following theorem. 


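5.22.9 THEOREM. Let $P_k$ and $Q_k$ be the linear operators defined by 

$$P_k: u(x_1,\ldots,x_n) \to i\,\frac{\partial u}{\partial x_k}$$
$$Q_k: u(x_1,\ldots,x_n) \to x_k\,u(x_1,\ldots,x_n),$$

where the domains are 

$$D_{P_k} = \{u \in L_2(R^n) : P_k u \in L_2(R^n)\}$$
$$D_{Q_k} = \{u \in L_2(R^n) : Q_k u \in L_2(R^n)\}.$$

Then the Fourier transform $\mathcal{F}$ sets up a one-to-one correspondence between $D_{P_k}$ and 
$D_{Q_k}$ in such a way that 

$$P_k = \mathcal{F}Q_k\mathcal{F}^{-1} \quad \text{and} \quad Q_k = \mathcal{F}^{-1}P_k\mathcal{F}.$$

As a corollary to this we can prove that if $u \in L_2(R^n)$ with 

$$D^\alpha u = \frac{\partial^{|\alpha|}u}{\partial x_1^{\alpha_1}\cdots\partial x_n^{\alpha_n}} \in L_2(R^n),$$

then 

$$(\mathcal{F}D^\alpha u)(y) = (iy)^\alpha\,\hat{u}(y), \tag{5.22.18}$$

where $\hat{u} = \mathcal{F}u$ and $y^\alpha = y_1^{\alpha_1}\cdots y_n^{\alpha_n}$. ∎ 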


EXERCISES 


1. For each $x \in L_2[0,1]$ let y = Tx be the solution of $y' + ay = x$ that satisfies 
y(0) = 0, where a is a constant. Determine the adjoint T*. 


2. For each x in $L_2[0,1]$ let y = Tx be the solution of $y'' + ay' + by = x$ that 
satisfies y(0) = y(1) = 0, where a and b are constants. Determine T*. Is it ever 
true that T = T*? 


5.22. THE ADJOINT OPERATOR 365 


3. Let L be a bounded linear operator on a Hilbert space H. Verify the following
relationships:

$$\mathcal{N}(L^*) = \mathcal{N}(LL^*); \qquad \overline{\mathcal{R}(L)} = \overline{\mathcal{R}(LL^*)}.$$


4. Use Theorem 5.22.6 to show that $\mathcal{R}(L) = H$ if and only if L* has a continuous
inverse. [Hint: Use also the Closed Graph Theorem, Exercise 13, Section 8.]


5. Let L: H → H be a topological isomorphism. Show that $(L^{-1})^* = (L^*)^{-1}$.

6. Let H be a Hilbert space and consider * as a mapping of Blt[H,H] into itself
where *: L → L*. Show that * is one-to-one and its range is all of Blt[H,H].
Is * linear? Does * preserve norms? (Compare this with Exercise 7, Section 21.)


7. The adjoint was defined for operators on a Hilbert space. Can this be extended
to operators on an (incomplete) inner product space? If not, why not?


8. Let $T: l_2 \to l_2$ be given by $T(x_1,x_2,\dots) = (x_1, \tfrac{1}{2}x_2, \dots, (1/n)x_n, \dots)$. Determine T*.


9. Let L: H → H be a bounded linear mapping of H onto H. Show that L is an
isometry if and only if L* is an isometry.


10. Define a mapping $L: l_2 \to l_2$ by $(y_1,y_2,\dots) = L(x_1,x_2,\dots)$ where


pe Ay 


no 2 
n 
Show that L is a bounded linear operator with $\|L\| \le \left(\sum_{n=1}^{\infty} 1/n^2\right)^{1/2}$. What is
L*?


11. Define a mapping $A: l_2 \to l_2$ by $(y_1,y_2,\dots) = A(x_1,x_2,\dots)$ where

$$y_n = \sum_{j=1}^{n} a_{nj}\,x_j.$$

Assume that A is a time-invariant operator and $\sum_{n=1}^{\infty} |a_{n1}|^2 < \infty$. Show that A
is a bounded linear operator with $\|A\| = \left(\sum_{n=1}^{\infty} |a_{n1}|^2\right)^{1/2}$. What is A*?


12. Show that if a sequence $\{L_n\}$ converges uniformly to L, where the $L_n$'s and L are
bounded linear transformations on a Hilbert space H, then the sequence
$\{L_n^*\}$ converges uniformly to L*.


13. Let $\hat f$ be the Fourier transform of f given by (5.19.3). Show that $\|\hat f - g_N\| \to 0$
as $N \to \infty$, where

$$g_N(y) = \int_{-N}^{N} e^{-ixy} f(x)\,dx.$$

14. Show that if $u \in D_Q$, then $\int_{-\infty}^{\infty} |u(x)|\,dx < \infty$. [Hint: Let v(x) = xu(x), then
$v \in L_2(-\infty,\infty)$ and by the Schwarz Inequality

$$u(x) = \frac{1}{x}\,v(x) \in L_1(-\infty,-1) \cap L_1(1,\infty).$$

Since $u \in L_2(-1,1)$ one has $u \in L_1(-\infty,\infty)$.]




15. Show that if $v \in D_P$, then

$$\lim_{x\to\pm\infty} v(x) = 0.$$

[Hint: Note that

$$|v(x)|^2 - |v(0)|^2 = \int_0^x \frac{d}{dy}|v(y)|^2\,dy = \int_0^x v'(y)\overline{v(y)}\,dy + \int_0^x v(y)\overline{v'(y)}\,dy.$$

Now use the fact that v and v' are in $L_2(-\infty,\infty)$ to show that the limits as
$x \to \pm\infty$ exist.]


16. Prove Theorem 5.22.9.
17. Prove Equation (5.22.18). 
18. Let g = Sf be defined by

$$g(x) = \left(\frac{2}{\pi}\right)^{1/2}\int_0^{\infty} \sin xy\, f(y)\,dy.$$

Show that S is a unitary mapping of $L_2[0,\infty)$ onto itself and that $S^{-1} = S$.
19. Let g = Cf be defined by

$$g(x) = \left(\frac{2}{\pi}\right)^{1/2}\int_0^{\infty} \cos xy\, f(y)\,dy.$$

Show that C is a unitary mapping of $L_2[0,\infty)$ onto itself and that $C^{-1} = C$.
20. (Hankel Transform.) Let g = Hf be defined by

$$g(x) = \int_0^{\infty} (xy)^{1/2} J_{\nu}(xy)\, f(y)\,dy,$$

where $J_{\nu}$ is the Bessel function of the first kind of order $\nu$, and where $\nu > -\tfrac{1}{2}$.
Show that H is a unitary mapping on $L_2[0,\infty)$ with $H^{-1} = H$. (See Stone
[1, p. 110] for more details.)


21. (Watson Transform.) Let w(r) be a function with the property that $r^{-1}w(r)$ is
in $L_2(a,b)$ and

$$\int_a^b \frac{w(sx)\,w(tx)}{x^2}\,dx = \begin{cases} \min\{|s|,|t|\}, & \text{if } st > 0 \\ 0, & \text{if } st < 0. \end{cases}$$

Define the Watson transform g = Wf by

$$g(x) = \frac{d}{dx}\int_a^b \frac{w(xy)}{y}\, f(y)\,dy.$$

(a) Show that W is a unitary mapping of $L_2(a,b)$ onto itself and that $W^{-1}$ is
given by

$$f(y) = \frac{d}{dy}\int_a^b \frac{w(xy)}{x}\, g(x)\,dx.$$

[Hint: Study Example 12 carefully.]
(b) Show that the Watson transform is a generalization of the Fourier-Plancherel
transform.


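22. Let $L: C^3 \to C^3$ be represented by the matrix

$$\begin{pmatrix} a & \dfrac{i}{\sqrt{4}} & \dfrac{-1+2i}{\sqrt{12}} \\[2mm] b & \dfrac{1+i}{\sqrt{4}} & \dfrac{1-i}{\sqrt{12}} \\[2mm] c & \dfrac{-1}{\sqrt{4}} & \dfrac{2-i}{\sqrt{12}} \end{pmatrix}.$$

Determine a, b, and c so that L is unitary. (Assume that $C^3$ has the usual inner
product.)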


23. Let $H_1$ and $H_2$ be Hilbert spaces with inner products $(\cdot,\cdot)_1$ and $(\cdot,\cdot)_2$, respectively.
Let $L: H_1 \to H_2$ be a bounded linear operator. Define the adjoint
$L^*: H_2 \to H_1$ so that

$$(y,Lx)_2 = (L^*y,x)_1$$

holds for all x in $H_1$ and all y in $H_2$.

(a) Show that L* is a bounded linear operator.

(b) Show that L = L**.

(c) Show that $\|L\| = \|L^*\|$.

(d) Show that a bounded linear operator $U: H_1 \to H_2$ is unitary if and only if
$U^*U = I_1$ and $UU^* = I_2$, where $I_1$ and $I_2$ are the identity operators on
$H_1$ and $H_2$, respectively.


23. NORMAL AND SELF-ADJOINT OPERATORS 


This section is devoted to the elementary properties of normal and self-adjoint 
operators. 


5.23.1 DEFINITION. Let H be a Hilbert space, and let T: H → H be a bounded
linear transformation. T is said to be normal if TT* = T*T, that is, if T commutes
with its adjoint.


5.23.2 DEFINITION. Let H be a Hilbert space, and let T: H → H be a bounded
linear transformation. T is said to be self-adjoint if T = T*.


Obviously another characterization of self-adjointness is given by the relationship
(Tx,y) = (x,Ty) for all x, y in H.


5.23.3 LEMMA. If T is self-adjoint, then it is normal.


Proof: The proof of this lemma is trivial. ∎


We have discussed a number of classes of operators so far. Their relation to 
normal and self-adjoint operators is illustrated in Figure 5.23.1. 

Normal and self-adjoint operators are important for at least two reasons. First, 
many physical systems—but not all—can be modeled mathematically using these 
operators. Secondly, normal and self-adjoint operators have an especially simple 
structure. This latter point is not self-evident. Indeed, its explanation is the subject 
of the next chapter. 


[Figure 5.23.1. Space of Linear Operators. Regions shown: Continuous Linear Operators, Normal Operators, Self-Adjoint Operators, Orthogonal Projections, Unitary Operators.]


Self-Adjoint Operators 


Let us begin with some examples of self-adjoint operators. 


EXAMPLE 1. Let $H = C^n$ be the space of ordered n-tuples of complex numbers
with the usual inner product. Let T be the linear transformation represented by the
matrix [T]. T* is the linear transformation represented by $\overline{[T]}^{\,T}$, that is, the complex
conjugate of the transpose. Therefore, T is self-adjoint if and only if $[T] = \overline{[T]}^{\,T}$, that
is, if and only if [T] is a Hermitian matrix. ∎
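
The criterion of Example 1 can be checked numerically. The following Python sketch is only an illustration: the two 2 × 2 matrices, the test vectors, and the helper names (`adjoint`, `apply`, `inner`, `is_self_adjoint`) are assumptions made here, and the inner product is taken to be linear in its first argument.

```python
# Illustrative sketch: matrices are lists of rows of complex numbers.

def adjoint(m):
    """Return the conjugate transpose of a square matrix."""
    n = len(m)
    return [[m[j][i].conjugate() for j in range(n)] for i in range(n)]

def apply(m, x):
    """Return the matrix-vector product m x."""
    return [sum(a * b for a, b in zip(row, x)) for row in m]

def inner(x, y):
    """Usual inner product on C^n, linear in x and conjugate-linear in y."""
    return sum(a * b.conjugate() for a, b in zip(x, y))

def is_self_adjoint(m):
    """True when the matrix equals its conjugate transpose (Hermitian)."""
    return m == adjoint(m)

T = [[2 + 0j, 1 - 1j],
     [1 + 1j, 3 + 0j]]   # hypothetical Hermitian matrix
N = [[2 + 0j, 1 - 1j],
     [1 - 1j, 3 + 0j]]   # hypothetical non-Hermitian matrix

x = [1 + 2j, -1j]
y = [0.5 + 0j, 2 - 1j]

# |(Tx, y) - (x, Ty)| vanishes for the Hermitian matrix only.
gap_T = abs(inner(apply(T, x), y) - inner(x, apply(T, y)))
gap_N = abs(inner(apply(N, x), y) - inner(x, apply(N, y)))
print(is_self_adjoint(T), gap_T)
print(is_self_adjoint(N), gap_N)
```

The exact comparison in `is_self_adjoint` is adequate for hand-entered entries; for matrices produced by floating-point computation a tolerance would be the safer choice.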


EXAMPLE 2. The operator T on H defined by $Tx = \alpha x$, where $\alpha$ is a scalar, is
self-adjoint if and only if $\alpha$ is real. ∎


EXAMPLE 3. Consider the integral operator K given by

$$y(t) = \int_I k(t,s)\,x(s)\,ds$$

on $L_2(I)$, where $\int_I\int_I |k(t,s)|^2\,ds\,dt < \infty$. It follows from Example 3, Section 22 that
K is self-adjoint if and only if $k(t,s) = \overline{k(s,t)}$. ∎


If A and B are self-adjoint operators on H, then so is A + B. Similarly, $\alpha A$,
where $\alpha$ is real, is self-adjoint whenever A is. So the set of all self-adjoint operators
on H forms a real normed linear space. The norm is, of course, the operator norm.
Note that this real linear space is not a linear subspace of the complex space
Blt[H,H], the space of bounded linear operators on H. However, we can say the
following.


5.23.4 THEOREM. The set of all self-adjoint operators on H is a closed set in
Blt[H,H].


Proof: Let $\{L_n\}$ be a sequence of self-adjoint operators with $\|L_n - L\| \to 0$,
where L is a bounded linear operator on H. We want to show that

$$(Lx,y) = (x,Ly)$$

for all x, y in H. Since $L_n$ is self-adjoint we get

$$|(Lx,y) - (x,Ly)| = |(Lx,y) - (L_nx,y) + (x,L_ny) - (x,Ly)|$$
$$= |((L - L_n)x,y) + (x,(L_n - L)y)|$$
$$\le 2\|L - L_n\|\,\|x\|\,\|y\| \to 0, \quad\text{as } n \to \infty.$$

Hence L is self-adjoint. ∎


It is not true that if A and B are self-adjoint, then AB is self-adjoint. However, 
one can say the following: 


5.23.5 THEOREM. If A and B are self-adjoint operators on a Hilbert space H, 
then AB is self-adjoint if and only if AB = BA. 


We leave the proof of this theorem as an exercise. 
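
Theorem 5.23.5 can be illustrated (not proved) with small matrices. In the Python sketch below the three real symmetric matrices `A`, `B`, `C` and the helper names are hypothetical choices made for this illustration.

```python
# Illustrative sketch: products of self-adjoint matrices.

def adjoint(m):
    """Return the conjugate transpose of a square matrix."""
    n = len(m)
    return [[m[j][i].conjugate() for j in range(n)] for i in range(n)]

def matmul(a, b):
    """Return the matrix product a b of two square matrices."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def is_self_adjoint(m):
    """True when the matrix equals its conjugate transpose."""
    return m == adjoint(m)

A = [[1, 0], [0, 2]]   # self-adjoint
B = [[0, 1], [1, 0]]   # self-adjoint, does not commute with A
C = [[2, 1], [1, 2]]   # self-adjoint, commutes with B

# A and B do not commute, and AB fails to be self-adjoint.
print(matmul(A, B) == matmul(B, A), is_self_adjoint(matmul(A, B)))
# B and C commute, and BC is self-adjoint.
print(matmul(B, C) == matmul(C, B), is_self_adjoint(matmul(B, C)))
```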


If T is self-adjoint, then

$$(x,Tx) = (Tx,x) = \overline{(x,Tx)}$$

for all $x \in H$. In other words, (x,Tx) is real-valued on H. Actually this fact is one
characterization of self-adjoint operators.


5.23.6 THEOREM. Let T: H → H be a bounded linear operator on a Hilbert
space H. T is self-adjoint if and only if (x,Tx) is real-valued for all x in H.




Proof: In light of the comments above we need only show that (x,Tx) real
implies that T is self-adjoint. Suppose

$$(x,Tx) = \overline{(x,Tx)}$$

for all $x \in H$. It follows that

$$(x,Tx) = (Tx,x)$$

for all $x \in H$. Since

$$4(x,Ty) = (x + y, T(x + y)) - (x - y, T(x - y))$$
$$\qquad + i(x + iy, T(x + iy)) - i(x - iy, T(x - iy))$$
$$= (T(x + y), x + y) - (T(x - y), x - y)$$
$$\qquad + i(T(x + iy), x + iy) - i(T(x - iy), x - iy)$$
$$= 4(Tx,y),$$

it follows that T = T*. ∎
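
The polarization identity used in this proof, and the statement of Theorem 5.23.6 itself, can be spot-checked numerically. The Python sketch below is an illustration under assumptions: the matrices `T` and `N`, the vectors, and the helper names are hypothetical, and the inner product is linear in its first argument.

```python
# Illustrative sketch: the quadratic form (u, Tu) and its polarization.

def apply(m, x):
    """Return the matrix-vector product m x."""
    return [sum(a * b for a, b in zip(row, x)) for row in m]

def inner(x, y):
    """Inner product on C^n, linear in x and conjugate-linear in y."""
    return sum(a * b.conjugate() for a, b in zip(x, y))

def quad(m, u):
    """Return the quadratic form (u, Tu)."""
    return inner(u, apply(m, u))

def combo(x, y, c):
    """Return the vector x + c y."""
    return [a + c * b for a, b in zip(x, y)]

def polarization(m, x, y):
    """Right-hand side of the identity for 4(x, Ty) in the proof."""
    return (quad(m, combo(x, y, 1)) - quad(m, combo(x, y, -1))
            + 1j * quad(m, combo(x, y, 1j))
            - 1j * quad(m, combo(x, y, -1j)))

T = [[2 + 0j, 1 - 1j], [1 + 1j, 3 + 0j]]   # hypothetical Hermitian matrix
N = [[0j, 1 + 0j], [0j, 0j]]               # hypothetical non-Hermitian matrix
x = [1 + 2j, -1j]
y = [0.5 + 0j, 2 - 1j]

lhs = 4 * inner(x, apply(T, y))
rhs = polarization(T, x, y)
print(abs(lhs - rhs))                 # the identity holds for any linear T
print(quad(T, x).imag)                # real quadratic form for Hermitian T
print(quad(N, [1 + 0j, 1j]).imag)     # nonzero imaginary part for N
```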
EXAMPLE 4. Let $H = L_2(-i\infty,i\infty)$ and consider a linear time-invariant
system whose transfer function is $T(i\omega)$. The equation

$$Y(i\omega) = T(i\omega)X(i\omega), \qquad i\omega \in (-i\infty,i\infty),$$

models the system in the frequency domain. The operator is self-adjoint if and only
if $T(i\omega)$ is real for (almost) all $i\omega$. ∎


The last theorem leads to a method of ordering self-adjoint operators. 


5.23.7 DEFINITION. A bounded linear self-adjoint operator T on a Hilbert
space H is said to be positive if $(x,Tx) \ge 0$ for all x in H. We denote this by $T \ge 0$
or $0 \le T$. It is strictly positive if $(x,Tx) > 0$ for all $x \ne 0$. We shall denote this by
$T > 0$ or $0 < T$.


We shall, then, write $A \le B$ if $0 \le B - A$. We leave it to the reader to show
that "$\le$" does indeed define a partial ordering on the set of all self-adjoint
operators on H. Similarly, we write $A < B$ if $0 < B - A$.

Recall that the norm of T is given by any one of the following:

$$\|T\| = \sup\{\|Tx\|: \|x\| = 1\},$$
$$\|T\| = \sup\{\|Tx\|: \|x\| \le 1\},$$
$$\|T\| = \sup\left\{\frac{\|Tx\|}{\|x\|}: x \ne 0\right\},$$
$$\|T\| = \inf\{B: \|Tx\| \le B\|x\| \text{ for all } x\}.$$


The next theorem gives two new formulas for computing $\|T\|$ when T is self-adjoint.




5.23.8 THEOREM. Let T: H → H be a bounded linear self-adjoint operator on a
Hilbert space H. Then $\|T\|$ is given by

$$\|T\| = \sup\{|(Tx,x)|: \|x\| = 1\},$$

or

$$\|T\| = \sup\{|(Tx,y)|: \|x\| = \|y\| = 1\}.$$

Proof: We shall prove the first statement and leave the second as an exercise.
Let

$$\alpha = \sup\{|(Tx,x)|: \|x\| = 1\}.$$

Since

$$|(Tx,x)| \le \|T\|\,\|x\|^2,$$

for all $x \in H$, it follows that $\alpha \le \|T\|$.

Let us now show that $\|T\| \le \alpha$. For this we let $\beta > 0$, then using the fact that
$(Tx,Tx) = (T^2x,x)$ we get

$$4\|Tx\|^2 = (T(\beta x + \beta^{-1}Tx), \beta x + \beta^{-1}Tx) - (T(\beta x - \beta^{-1}Tx), \beta x - \beta^{-1}Tx).$$

From the definition of $\alpha$ we get

$$4\|Tx\|^2 \le \alpha\|\beta x + \beta^{-1}Tx\|^2 + \alpha\|\beta x - \beta^{-1}Tx\|^2 = 2\alpha\left(\beta^2\|x\|^2 + \beta^{-2}\|Tx\|^2\right), \tag{5.23.1}$$

where the last step is an application of the Parallelogram Law. If $\|Tx\| \ne 0$, we set
$\beta^{-2} = \|x\|/\|Tx\|$ and (5.23.1) becomes

$$\|Tx\|^2 \le \alpha\,\|Tx\|\,\|x\|,$$

that is,

$$\|Tx\| \le \alpha\|x\|. \tag{5.23.2}$$

If $\|Tx\| = 0$, then (5.23.2) is obviously true; therefore, $\|T\| \le \alpha$. ∎
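
A rough numerical experiment shows why self-adjointness matters in Theorem 5.23.8. The Python sketch below estimates both suprema on a grid of unit vectors in $C^2$; the grid resolution, the two sample matrices, and the function names are assumptions of this illustration, and the grid values are approximations rather than exact norms.

```python
# Illustrative sketch: compare sup |(Tx, x)| with sup ||Tx|| over unit vectors.
import cmath
import math

def apply(m, x):
    """Return the matrix-vector product m x."""
    return [sum(a * b for a, b in zip(row, x)) for row in m]

def inner(x, y):
    """Inner product on C^n, linear in x and conjugate-linear in y."""
    return sum(a * b.conjugate() for a, b in zip(x, y))

def norm(x):
    """Euclidean norm of a vector."""
    return math.sqrt(sum(abs(a) ** 2 for a in x))

def unit_vectors(n_theta=181, n_phi=72):
    """Grid of unit vectors (cos t, e^{ip} sin t); a global phase is irrelevant."""
    for j in range(n_theta):
        t = (math.pi / 2) * j / (n_theta - 1)
        for k in range(n_phi):
            p = 2 * math.pi * k / n_phi
            yield [math.cos(t), cmath.exp(1j * p) * math.sin(t)]

def quadratic_sup(m):
    """Grid estimate of sup{|(Tx, x)| : ||x|| = 1}."""
    return max(abs(inner(apply(m, x), x)) for x in unit_vectors())

def operator_norm(m):
    """Grid estimate of sup{||Tx|| : ||x|| = 1}."""
    return max(norm(apply(m, x)) for x in unit_vectors())

S = [[2, 1], [1, -3]]   # self-adjoint: the two estimates agree
N = [[0, 1], [0, 0]]    # not self-adjoint: the first estimate is smaller

print(quadratic_sup(S), operator_norm(S))
print(quadratic_sup(N), operator_norm(N))
```

For the nilpotent matrix `N` the quadratic form never exceeds 1/2 in absolute value although the operator norm is 1, so the first formula of the theorem cannot be expected without the self-adjointness hypothesis.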


As the next theorem shows, a projection is orthogonal if and only if it is
self-adjoint.


5.23.9 THEOREM. A continuous projection P on a Hilbert space H is orthogonal
if and only if it is self-adjoint.


Proof: First suppose that P is orthogonal. It follows that if x is any point
in H, there exists a unique $r \in \mathcal{R}(P)$ and $n \in \mathcal{N}(P)$ such that x = r + n and $r \perp n$.
Then (x,Px) = (r + n, P(r + n)) = (r + n, r) = (r,r) which is real for all x. Hence,
from Theorem 5.23.6, P is self-adjoint.

Now suppose that P is a self-adjoint projection on H. Again for any $x \in H$ we
can uniquely write x = r + n. We want to show that $r \perp n$. But (r,n) = (Pr,n) =
(r,P*n) = (r,Pn) = (r,0) = 0. Hence $\mathcal{R}(P) \perp \mathcal{N}(P)$, so P is an orthogonal
projection. ∎
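
The contrast between an orthogonal and a non-orthogonal projection is easy to see in two dimensions. In the Python sketch below the matrices `P` and `Q`, the test vector, and the helper names are hypothetical examples chosen for this illustration.

```python
# Illustrative sketch: an orthogonal projection versus an oblique one.

def matmul(a, b):
    """Return the matrix product a b of two square matrices."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def apply(m, x):
    """Return the matrix-vector product m x."""
    return [sum(a * b for a, b in zip(row, x)) for row in m]

def is_projection(m):
    """True when m is idempotent, that is, m m = m."""
    return matmul(m, m) == m

def is_symmetric(m):
    """True when a real matrix equals its transpose (self-adjoint)."""
    n = len(m)
    return all(m[i][j] == m[j][i] for i in range(n) for j in range(n))

def range_null_inner(m, x):
    """Inner product (r, n) for the splitting x = r + n, r = Px, n = x - Px."""
    r = apply(m, x)
    n = [a - b for a, b in zip(x, r)]
    return sum(a * b for a, b in zip(r, n))

P = [[0.5, 0.5], [0.5, 0.5]]   # orthogonal projection onto span{(1, 1)}
Q = [[1, 1], [0, 0]]           # projection onto span{(1, 0)} along span{(1, -1)}
x = [0, 1]

print(is_projection(P), is_symmetric(P), range_null_inner(P, x))
print(is_projection(Q), is_symmetric(Q), range_null_inner(Q, x))
```

Both matrices are idempotent, but only the symmetric one splits `x` into orthogonal pieces.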


EXAMPLE 5. Let M be a closed linear subspace in a Hilbert space H, and let
$P_1$ and $P_2$ be the orthogonal projections of H onto M and $M^{\perp}$, respectively. Then

$$T = \lambda_1P_1 + \lambda_2P_2 \tag{5.23.3}$$

is self-adjoint if and only if $\lambda_1$ and $\lambda_2$ are real. This is an especially important
example, for it is a major fact of linear analysis that (5.23.3) is the basic model for all
self-adjoint operators. This will be made amply clear in the next chapter. ∎


EXAMPLE 6. We say that a mapping Y of $L_2(-\infty,\infty)$ into itself is passive if

$$\mathrm{Re}\left\{\int_{-\infty}^{t} \overline{x(\tau)}\,(Yx)(\tau)\,d\tau\right\} \ge 0,$$

for all $x \in L_2(-\infty,\infty)$ and all $t \in R$, see Youla, Castriota and Carlin [1]. The
motivation for this definition comes from network theory. Now, we will suppose
we have a network as shown in Figure 5.23.2. Here v denotes the voltage across the
terminals, and x denotes the current flowing through them. We assume that the
relation between v and x can be modeled by an equation x = Yv, where Y is a
linear mapping of $L_2(-\infty,\infty)$ into itself. This network is passive if the net energy
supplied is positive at all times, that is, if

$$\mathrm{Re}\left\{\int_{-\infty}^{t} \overline{v(\tau)}\,x(\tau)\,d\tau\right\} = \mathrm{Re}\left\{\int_{-\infty}^{t} \overline{v(\tau)}\,(Yv)(\tau)\,d\tau\right\} \ge 0,$$

for all applied voltages v, resulting currents x, and times t.

[Figure 5.23.2. Network.]

It is an interesting fact that if Y is passive, it is causal. We will demonstrate
this fact here. For each time T let $X_T$ denote the linear subspace of $L_2(-\infty,\infty)$
made up of all functions x that vanish to the left of T. Since Y is linear, it follows
(Example 4, Section 4.3) that Y is causal if and only if each subspace $X_T$ is invariant
under Y.

Let $P_T$ denote the orthogonal projection of $L_2(-\infty,\infty)$ onto $X_T^{\perp}$. Then we
can restate the passivity of Y by

$$\mathrm{Re}(P_Tv, Yv) \ge 0,$$

for all $v \in L_2(-\infty,\infty)$, and all $T \in (-\infty,\infty)$.




Let T be fixed, and let $v_1$ be any point in $X_T$ and $v_2$ be any point in $L_2(-\infty,\infty)$.
If $v = \alpha v_1 + v_2$, where $\alpha$ is a scalar, then $P_Tv = P_Tv_2$ and $Yv = \alpha Yv_1 + Yv_2$, so

$$\mathrm{Re}(P_Tv, Yv) = \mathrm{Re}(P_Tv_2, \alpha Yv_1) + \mathrm{Re}(P_Tv_2, Yv_2) \ge 0. \tag{5.23.4}$$

Since $\alpha$ is arbitrary we see that the only way (5.23.4) can hold is if

$$(P_Tv_2, Yv_1) = 0.$$

Since $P_T$ is self-adjoint, we have

$$(v_2, P_TYv_1) = 0.$$

But $v_2$ is arbitrary, so

$$P_TYv_1 = 0.$$

In other words, $Yv_1 \in X_T$, or $X_T$ is invariant under Y. Since T was arbitrary, it
follows that Y is causal. ∎


Normal Operators 


Again, normal operators are those operators that commute with their adjoint, 
and the self-adjoint operators are a subset of the normal ones. Let us begin by 
considering an example of a normal operator. 


EXAMPLE 7. Let $T = \lambda_1P_1 + \lambda_2P_2$, where $P_1$ and $P_2$ are the orthogonal
projections onto M and $M^{\perp}$, as in Example 5.
Since $T^* = \bar\lambda_1P_1^* + \bar\lambda_2P_2^* = \bar\lambda_1P_1 + \bar\lambda_2P_2$, we have

$$T^*T = (\bar\lambda_1P_1 + \bar\lambda_2P_2)(\lambda_1P_1 + \lambda_2P_2) = |\lambda_1|^2P_1 + |\lambda_2|^2P_2$$

and

$$TT^* = (\lambda_1P_1 + \lambda_2P_2)(\bar\lambda_1P_1 + \bar\lambda_2P_2) = |\lambda_1|^2P_1 + |\lambda_2|^2P_2,$$

so T is normal for any complex numbers $\lambda_1$ and $\lambda_2$. This is an especially important
example, for it is another major fact of linear analysis that T represents the basic
model for all normal operators. This also will be made amply clear in the next
chapter. ∎
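
The computation in Example 7 can be reproduced with a concrete pair of projections. In the Python sketch below the subspace M = span{(1, 1)} in $C^2$, the scalars `lam1` and `lam2`, and the helper names are assumptions made for this illustration.

```python
# Illustrative sketch: T = lam1 P1 + lam2 P2 is normal for complex lam1, lam2.

def adjoint(m):
    """Return the conjugate transpose of a square matrix."""
    n = len(m)
    return [[m[j][i].conjugate() for j in range(n)] for i in range(n)]

def matmul(a, b):
    """Return the matrix product a b of two square matrices."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def max_abs_diff(a, b):
    """Largest entrywise difference between two matrices."""
    return max(abs(p - q) for ra, rb in zip(a, b) for p, q in zip(ra, rb))

P1 = [[0.5, 0.5], [0.5, 0.5]]      # orthogonal projection onto M = span{(1, 1)}
P2 = [[0.5, -0.5], [-0.5, 0.5]]    # orthogonal projection onto the complement
lam1, lam2 = 1 + 2j, -3j           # hypothetical complex scalars

T = [[lam1 * P1[i][j] + lam2 * P2[i][j] for j in range(2)] for i in range(2)]

TstarT = matmul(adjoint(T), T)
TTstar = matmul(T, adjoint(T))
# |lam1|^2 P1 + |lam2|^2 P2, the common value predicted by Example 7
expected = [[abs(lam1) ** 2 * P1[i][j] + abs(lam2) ** 2 * P2[i][j]
             for j in range(2)] for i in range(2)]

print(max_abs_diff(TstarT, TTstar))
print(max_abs_diff(TstarT, expected))
print(T == adjoint(T))   # T is normal but not self-adjoint
```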


EXAMPLE 8. In this example we show that continuous time-invariant linear
operators are normal. Let T be a continuous time-invariant linear mapping of
$H = L_2(-\infty,\infty)$ into itself. We know (Example 10, Section 5.22) that T* is
time-invariant. But we recall (Example 5, Section 4.10) that continuous time-invariant
linear operators on $L_2(-\infty,\infty)$ commute; therefore, TT* = T*T and T is
normal. ∎


The following theorem is a convenient characterization of normal operators. 


5.23.10 THEOREM. A bounded linear operator L on a Hilbert space H is normal
if and only if $\|L^*x\| = \|Lx\|$ for every $x \in H$.




Proof: First assume that L is normal. Then one has (LL*x,x) = (L*Lx,x), 
for all x in H, which implies that (L*x,L*x) = (Lx,Lx) for all x in H. In other 
words, ||L*x|| = ||Lx||. 

Now assume that ||L*x|| = ||Lx||, for all x in H. By reversing the reasoning 
above, one gets 
((LL* — L*L)x,x) = 0 


for all x in H. We need the next lemma to conclude that LL* — L*L = 0. 


5.23.11 LEMMA. Let M be a linear operator on a complex inner product space
X. If (Mx,x) = 0, for all x in X, then M = 0.


Proof: Since (Mx,x) = 0 for all x in X, we note that for every x and y in X
and any scalars $\alpha$ and $\beta$ one has

$$0 = (M(\alpha x + \beta y), \alpha x + \beta y) - |\alpha|^2(Mx,x) - |\beta|^2(My,y)$$
$$= \alpha\bar\beta(Mx,y) + \bar\alpha\beta(My,x).$$

If we choose $\alpha = \beta = 1$, this fact becomes

$$(Mx,y) + (My,x) = 0,$$

and for $\alpha = i$, $\beta = 1$ one gets

$$i(Mx,y) - i(My,x) = 0.$$

It follows that (Mx,y) = 0, for all x and y, which implies that M = 0. This
completes the proof of the lemma and Theorem 5.23.10. ∎
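
Theorem 5.23.10 gives a test for normality that avoids forming LL* and L*L. The Python sketch below applies it to one normal and one non-normal 2 × 2 matrix; the matrices, the vectors, and the helper names are hypothetical, and a single test vector can only refute normality, never establish it.

```python
# Illustrative sketch: compare ||Lx|| with ||L*x||.
import math

def adjoint(m):
    """Return the conjugate transpose of a square matrix."""
    n = len(m)
    return [[m[j][i].conjugate() for j in range(n)] for i in range(n)]

def apply(m, x):
    """Return the matrix-vector product m x."""
    return [sum(a * b for a, b in zip(row, x)) for row in m]

def norm(x):
    """Euclidean norm of a vector."""
    return math.sqrt(sum(abs(a) ** 2 for a in x))

L = [[1, 1j], [1j, 1]]   # normal (LL* = L*L = 2I) but not self-adjoint
N = [[0, 1], [0, 0]]     # not normal

x = [1 + 1j, 2]
e1 = [1, 0]

print(norm(apply(L, x)), norm(apply(adjoint(L), x)))    # equal
print(norm(apply(N, e1)), norm(apply(adjoint(N), e1)))  # 0.0 and 1.0
```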


In Theorem 5.23.4 we indicated how the class of self-adjoint operators fits
into the linear space of bounded linear operators Blt[H,H]. We can say the
following in the case of the class of normal operators.


5.23.12 THEOREM. The class of all normal operators on a Hilbert space H is a
closed subset of Blt[H,H]; moreover, it is closed under scalar multiplication.


Proof: Let $\{L_n\}$ be a sequence of normal operators with $\|L_n - L\| \to 0$, where
L is a bounded linear operator on H. We want to show that L is normal. But

$$\|LL^* - L^*L\| \le \|LL^* - L_nL_n^*\| + \|L_nL_n^* - L_n^*L_n\| + \|L_n^*L_n - L^*L\|$$
$$= \|LL^* - L_nL_n^*\| + \|L_n^*L_n - L^*L\|$$
$$= \|(L - L_n)(L^* - L_n^*) + (L - L_n)L_n^* + L_n(L^* - L_n^*)\|$$
$$\quad + \|(L^* - L_n^*)(L - L_n) + L_n^*(L - L_n) + (L^* - L_n^*)L_n\|$$
$$\le \|L - L_n\|\,\|L^* - L_n^*\| + \|L - L_n\|\,\|L_n^*\| + \|L_n\|\,\|L^* - L_n^*\|$$
$$\quad + \|L^* - L_n^*\|\,\|L - L_n\| + \|L_n^*\|\,\|L - L_n\| + \|L^* - L_n^*\|\,\|L_n\|.$$

It follows from Exercise 12, Section 5.22 that the right-hand side of the inequality
converges to zero as $n \to \infty$, so $\|LL^* - L^*L\| = 0$ showing that L is normal. The
fact that L is normal implies that $\alpha L$ is normal for all scalars $\alpha$ is trivial. ∎




If A and B are normal, contrary to the self-adjoint case, it does not follow that
A + B is normal, nor, of course, does it follow that AB is normal. Not too
surprisingly, commutativity is important.


5.23.13 THEOREM. If A and B are normal operators on a Hilbert space H such
that one commutes with the adjoint of the other, then A + B, AB, and BA are normal.


We see that the proof of this theorem is straightforward as soon as one notes
that $AB^* = B^*A \Leftrightarrow BA^* = A^*B$.
The next theorem is a sometimes more useful one whose proof is not trivial.


5.23.14 THEOREM. If A and B are normal operators on a Hilbert space H such 
that AB = BA, then A + B, AB, and BA are normal. 


The proof of this theorem is available in Fuglede [1]. 
In general, if T is a bounded linear transformation, we have $\|T^2\| \le \|T\|^2$. In
the case of normal operators, we always get the equality.


5.23.15 THEOREM. If L is a normal operator on a Hilbert space H, then

$$\|L^2\| = \|L\|^2.$$


Proof: It follows from Theorem 5.23.10 that $\|L^2x\| = \|L^*Lx\|$, for all $x \in H$.
So $\|L^2\| = \|L^*L\|$. But from Corollary 5.22.3 we know that $\|L^*L\| = \|L\|^2$. ∎
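
Theorem 5.23.15 can be checked on 2 × 2 matrices, where the operator norm is the square root of the largest eigenvalue of L*L. The Python sketch below uses the closed-form eigenvalue of a 2 × 2 Hermitian matrix; the function `op_norm_2x2` and the two sample matrices are hypothetical and apply to the 2 × 2 case only.

```python
# Illustrative sketch: ||L^2|| versus ||L||^2 for 2 x 2 matrices.
import math

def adjoint(m):
    """Return the conjugate transpose of a square matrix."""
    n = len(m)
    return [[m[j][i].conjugate() for j in range(n)] for i in range(n)]

def matmul(a, b):
    """Return the matrix product a b of two square matrices."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def op_norm_2x2(m):
    """Operator norm of a 2 x 2 matrix: sqrt of the top eigenvalue of m* m."""
    g = matmul(adjoint(m), m)
    tr = (g[0][0] + g[1][1]).real
    det = (g[0][0] * g[1][1] - g[0][1] * g[1][0]).real
    disc = max(tr * tr - 4 * det, 0.0)   # guard against rounding below zero
    return math.sqrt((tr + math.sqrt(disc)) / 2)

L = [[1, 1j], [1j, 1]]   # normal: equality holds
N = [[0, 1], [0, 0]]     # not normal: N^2 = 0 while ||N|| = 1

print(op_norm_2x2(matmul(L, L)), op_norm_2x2(L) ** 2)
print(op_norm_2x2(matmul(N, N)), op_norm_2x2(N) ** 2)
```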


EXERCISES 


1. By referring to the examples in Section 22, derive some examples of 
(a) self-adjoint operators, 
(b) normal operators. 


2. In the notation of Exercise 8, Section 12, show that $\|q\| = \|Q\|$, when q is a
bounded, symmetric sesquilinear functional on a complex inner product space
X. [Hint: Study the proof of Theorem 5.23.8.]


3. (Cayley Transform.) Let A be a self-adjoint operator on a Hilbert space H and
define U by

$$U = \frac{A - iI}{A + iI}$$

under the assumption that A + iI is invertible.


(a) Show that U is a unitary operator. 
(b) Show that 


4. Let $H = L_2[0,1]$. Define K: H → H as follows: For each x in $L_2[0,1]$ let y = Kx
be the solution y(t) of

$$y'' + a_1y' + a_2y = x$$

that satisfies y(0) = 0, y'(0) = 0. Show that K can be represented by an integral
operator. Find K*. Find conditions on the coefficients $a_1$ and $a_2$ in order that K
be self-adjoint.


5. Let I be any interval and let f be a complex-valued bounded function on I.
Let H be the complex space $L_2(I)$ with the usual inner product. Define
F: H → H by

$$Fx(t) = f(t)x(t).$$

Show that F is a bounded linear normal operator on H.


6. Let L be a bounded linear operator that is unitarily equivalent to multiplication
by a bounded transfer function. Show that L is normal.


7. Define $A: C^2 \to C^2$ by y = Ax or

$$\begin{pmatrix} y_1 \\ y_2 \end{pmatrix} = \begin{pmatrix} a & b \\ c & d \end{pmatrix}\begin{pmatrix} x_1 \\ x_2 \end{pmatrix},$$

where a, b, c, d are complex coefficients.
(a) Find necessary and sufficient conditions on the coefficients in order that A
be normal, self-adjoint, or unitary.

(b) Assume that A is normal. Find a polar decomposition of A. (See Exercise
16.)


8. Let g(x,y) be a bounded sesquilinear functional on a Hilbert space H.

(a) Show that g(x,y) = (Lx,y) for some bounded linear operator L. [Hint: Use
the Riesz Representation Theorem to get g(x,y) = (x,y*) = (x,L*y) and
show that L* is a bounded linear operator.]

(b) Show that q is symmetric if and only if L is self-adjoint.


9. Where are the invertible operators located in Figure 5.23.1? The operators with
bounded inverse? Where are the positive and strictly positive operators located
in Figure 5.23.1?


10. Prove Theorem 5.23.5.

11. Let $A = (a_{ij})$ be a 2 × 2 complex matrix operator. Give conditions on the
entries $a_{ij}$ in order that A be normal.


12. Show that Lemma 5.23.11 fails in a real inner product space. [Hint: Consider an
antisymmetric matrix on $R^n$.]


13. If T = A + iB, where A and B are self-adjoint operators on a Hilbert space H,
then this is said to be the Cartesian decomposition of T.

(a) Show that every bounded linear operator on H has a Cartesian decomposition.
[Hint: 2A = T + T*, 2iB = T − T*.]

(b) Show that the Cartesian decomposition is unique.

(c) Compute T* in terms of A and B.

(d) Show that T is normal if and only if A and B commute.


14. Let L be a normal operator on a Hilbert space H and let L = A + iB be the
Cartesian decomposition of L. Show that

$$\max\{\|A\|^2, \|B\|^2\} \le \|L\|^2 \le \|A\|^2 + \|B\|^2.$$


15. Let L be a positive self-adjoint operator on a Hilbert space H. This exercise will
lead to a proof of the fact that L has a positive square root; that is, there is a
positive self-adjoint operator R that satisfies $R^2 = L$. (Compare this with
Exercise 7, Section 3.6.)
(a) First assume that $L \le I$ and set M = I − L, and R = I − S. Then $R^2 = L$
becomes $S = \tfrac{1}{2}(M + S^2)$. Let $S_0 = 0$, $S_1 = 2^{-1}M$,

$$S_{n+1} = \tfrac{1}{2}(M + S_n^2), \qquad n = 1, 2, \dots.$$

Show that $S_n$ and $S_n - S_{n-1}$ are polynomials in M with nonnegative, real
coefficients.
(b) Show that $S_n \ge 0$ and $S_n - S_{n-1} \ge 0$ for all n.
(c) Show that $\|S_n\| \le 1$ for all n.
(d) Show that for each $x \in H$ the sequence $\{S_nx\}$ converges. Let $Sx = \lim S_nx$.
(e) Show that this operator S satisfies $S = \tfrac{1}{2}(M + S^2)$.
(f) Drop the restriction that $L \le I$.
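
The iteration in part (a) can be tried out on a small matrix before attempting the proofs. The Python sketch below runs it for one symmetric 2 × 2 matrix with $0 \le L \le I$; the matrix, the fixed step count, and the function names are assumptions of this illustration, and the code demonstrates convergence in one case rather than proving it.

```python
# Illustrative sketch: S_{n+1} = (M + S_n^2) / 2 with M = I - L and R = I - S.

def matmul(a, b):
    """Return the matrix product a b of two square matrices."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def positive_sqrt(L, steps=200):
    """Approximate the positive square root of L, assuming 0 <= L <= I."""
    n = len(L)
    identity = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    M = [[identity[i][j] - L[i][j] for j in range(n)] for i in range(n)]
    S = [[0.0] * n for _ in range(n)]          # S_0 = 0, so S_1 = M / 2
    for _ in range(steps):
        S2 = matmul(S, S)
        S = [[0.5 * (M[i][j] + S2[i][j]) for j in range(n)] for i in range(n)]
    return [[identity[i][j] - S[i][j] for j in range(n)] for i in range(n)]

L = [[0.5, 0.2], [0.2, 0.4]]   # hypothetical positive matrix with L <= I
R = positive_sqrt(L)
R2 = matmul(R, R)
residual = max(abs(R2[i][j] - L[i][j]) for i in range(2) for j in range(2))
print(R)
print(residual)
```

The step count of 200 is generous for this matrix; when L has an eigenvalue close to zero the iteration converges much more slowly, so a residual check such as the one above is a safer stopping rule than a fixed count.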


16. Let T be a normal operator and assume that T = RU = UR, where R is a
positive self-adjoint operator and U is a unitary operator. Then RU, or UR,
is said to be the polar decomposition of T. This exercise will lead to a proof that
every normal operator T has a polar decomposition.

(a) Let R be the positive square root of TT* = T*T.

(b) If y = Rx, let Uy = Tx. This defines U on the range of R, say $U: \mathcal{R}(R) \to H$.
Show that $\|Uy\| = \|y\|$.

(c) Show that U can be extended to all of H so that the extension is a unitary
mapping.

(d) Show that T = RU.

(e) Show that R and U commute.

(f) Show that the polar decomposition is unique.


17. (Continuation of Exercise 16.) Let T be any bounded linear operator on a
Hilbert space H.

(a) Show that there are positive self-adjoint operators $R_1$ and $R_2$ and unitary
operators $U_1$ and $U_2$ so that $T = R_1U_1 = U_2R_2$.

(b) Show that $R_1$, $U_1$ and $R_2$, $U_2$ are unique. [Note: There is an analogy between
operators on a Hilbert space H and complex numbers that is often useful.
The basic idea is to view the operation of taking the adjoint as analogous
to that of taking the complex conjugate. Then the self-adjoint operators are
analogous to the real numbers, for A = A*. Positive operators are analogous
to nonnegative real numbers and unitary transformations are analogous to
numbers of unit magnitude, for U*U = UU* = I. Just as any complex
number, any operator T has a Cartesian and polar decomposition.]


18. Show that if a unitary operator U: H → H is positive, then U = I, the identity
transformation.


19. If T: H → H is any bounded linear operator on the Hilbert space H, show that
T*T and TT* are self-adjoint.

20. Show that if T is normal, then $\lambda I - T$ is normal for all complex numbers $\lambda$.

21. Let $L = \lim L_n$, where L and the $L_n$ are bounded linear operators on a Hilbert space H.

(a) Show that if $L_n$ is self-adjoint for all n, then L is self-adjoint. [Hint: Study
Theorem 5.23.4.]

(b) Show that if $L_n$ is normal for all n, then L is normal. [Hint: Study Theorem
5.23.12.]

22. Do the relationships $A \le B$ or $A < B$ define a partial ordering (refer to
Appendix C) on the collection of all self-adjoint operators on a Hilbert space H?
Is it ever a total ordering?

23. Let $\xi$ be a random variable with range in a Hilbert space H and such that
$E(\|\xi\|^2) < \infty$. Define the covariance operator A by

$$E\big((x,\xi)\,\overline{(y,\xi)}\big) = (Ax,y),$$

where x, y ∈ H. Show that A is a bounded, positive, self-adjoint linear operator.

24. Calculate $\|A\|$, where


(a) A=|% | 


ad f 
(b) A= c b | 
fee 


when the entries are all real.

25. Let A be a self-adjoint operator on a Hilbert space H, and let

$$U = e^{iA} = \sum_{n=0}^{\infty} \frac{(iA)^n}{n!}.$$

(a) Show that U is a unitary operator.
(b) Show that $U^n = e^{inA}$ for every integer n.


26.¹⁹ Let A be a bounded self-adjoint operator on a Hilbert space H and let

$$U_t = e^{itA} = \sum_{n=0}^{\infty} \frac{(itA)^n}{n!}.$$

(a) Show that for each real number t, $U_t$ is a unitary operator.
(b) Show that $U_tU_s = U_{t+s}$.
(c) Show that the mapping $t \to U_t$ is continuous, where the space of operators
has the usual operator norm.
(d) Discuss the meaning of the equality

$$\frac{dU_t}{dt}\bigg|_{t=0} = \lim_{t\to 0}\frac{U_t - I}{t} = iA$$

in terms of topologies on the space of operators. (See Section 8.)


¹⁹ In this exercise, one constructs a continuous group $U_t$ of unitary operators in terms of a given
self-adjoint operator A. It is possible to turn this around, that is, given the continuous group $U_t$ of
unitary operators one can construct the "infinitesimal generator" A by means of $dU_t/dt = iAU_t$, and
show that $U_t = e^{itA}$, see Dunford and Schwartz [1].




27. (Scattering operators.) Let A and B be two bounded linear self-adjoint operators
on a separable Hilbert space H. Define $U_t$ and $V_t$ by $U_t = e^{-itA}$ and
$V_t = e^{-itB}$. Assume that the limits

$$\lim_{t\to+\infty} V_t^*U_tx = x_- = \Omega_-x$$

and

$$\lim_{t\to-\infty} V_t^*U_tx = x_+ = \Omega_+x$$

exist for each $x \in H$. Let $R_\pm$ denote the range of $\Omega_\pm$ and assume that $R_+ = R_-$.
The scattering operators are defined by
S=Q2705 
and 
(Compare with Jauch [1].) The object here is to show that S is a unitary
operator.


(a) Show that $\|\Omega_\pm x\| = \|x\|$ and that $\|\Omega_\pm^*x\| = \|x\|$ for all $x \in H$. (Explain
why this fact alone does not show that S and T are unitary.)

(b) Show that $\Omega_+^*\Omega_+ = I$ and $\Omega_-^*\Omega_- = I$.

(c) Show that $\Omega_+\Omega_+^* = \Omega_-\Omega_-^* = P$, where P is the orthogonal projection
onto $R_+ = R_-$.

(d) Show that SS* = S*S = I.

(e) Show that TT* = T*T = P.

28. Let y = Kx be a positive self-adjoint operator on $L_2[a,b]$ that is given by
$y(t) = \int_a^b k(t,\tau)x(\tau)\,d\tau$, where $k(t,\tau)$ is real-valued and continuous.

(a) Show that $k(t,t) \ge 0$ for $a \le t \le b$.

(b) Show that the converse need not be true. That is, construct a kernel
$k(t,\tau)$ that satisfies $k(t,t) \ge 0$ for $a \le t \le b$ such that the corresponding
operator K is self-adjoint but not positive.


24. COMPACT OPERATORS 


The compact operators form another important class of linear operators. As 
we shall see below they are operators with finite- or, in a meaningful sense, almost 
finite-dimensional ranges. They are neither included in nor include the class of 
normal operators or, for that matter, the class of self-adjoint operators. The situa- 
tion (for infinite-dimensional spaces) is illustrated in Figure 5.24.1. As we shall see 
in the next chapter, operators that are both normal and compact yield about the 
closest thing to a finite-dimensional structure that one can have on an infinite- 
dimensional space. 

Since the elementary properties of compact operators are not dependent on the
presence of an inner product, we shall abandon Hilbert space structure for this
section and return to Banach spaces.²⁰


²⁰ Compact operators can be defined on normed linear spaces, but many results require
completeness, so we just assume it at the outset.




[Figure 5.24.1. Regions shown: Continuous Linear Operators, Normal Operators, Compact Operators, Self-Adjoint Operators.]


5.24.1 DEFINITION. Let X and Y be two Banach spaces and let L: X → Y be a
linear transformation. L is said to be compact²¹ if L(D) lies in a compact subset of
Y, where $D = \{x \in X: \|x\| \le 1\}$.


The following theorem states that a compact operator is continuous; however,
there are plenty of continuous operators that are not compact. For example,
consider the identity mapping I on any infinite-dimensional space.


5.24.2 THEOREM. Let L: X → Y be a compact linear transformation of a
Banach space X into a Banach space Y. Then L is continuous.


Proof: The validity of this theorem follows almost immediately from
Definition 5.24.1. Since L(D) lies in a compact set, it is totally bounded, hence bounded
(Lemma 3.16.3). That is, there exists an $M < \infty$ such that $\sup\{\|x\|: x \in L(D)\} \le M$.
It follows that $\|Lx\| \le M\|x\|$ for all $x \in X$; therefore, L is bounded (that is,
continuous). ∎


An important example of a compact operator is described in the next theorem.


5.24.3 THEOREM. Let L: X → Y be a linear operator where the range $\mathcal{R}(L)$ is
finite-dimensional. Then L is compact.


Proof: By Theorem 5.10.7 we see that the unit ball in $\mathcal{R}(L)$ is compact.
Therefore a ball of any radius is compact, from which it is easily seen that L is
compact. (Compare with Example 3, Section 10.) ∎


As we said, any compact operator comes close to having a finite-dimensional
range.


²¹ Some authors use the phrase "completely continuous" instead.




5.24.4 THEOREM. Let L: X → Y be a compact linear transformation, where
X and Y are Banach spaces. Then given any $\varepsilon > 0$, there exists a finite-dimensional
subspace M of $\mathcal{R}(L)$ such that

$$\inf\{\|Lx - m\|: m \in M\} \le \varepsilon\|x\|.$$


In other words, the finite-dimensional subspace M comes within $\varepsilon$ (in the above
sense) of being the range of L. Presumably, the smaller $\varepsilon$ is, the larger the dimension
of M must be.


Proof: Let $\varepsilon > 0$ be given. Since L(D) is contained in a compact set, where D
is the closed unit ball in X, there is an $\varepsilon$-net in $\mathcal{R}(L) \cap L(D)$. Let M be the linear
subspace of Y generated by this $\varepsilon$-net. It follows that M is finite dimensional.
Moreover, $\mathrm{dist}(Lz,M) < \varepsilon$ for all $z \in D$. Then if x is any nonzero point in X it follows that

$$\inf\left\{\left\|L\frac{x}{\|x\|} - m\right\|: m \in M\right\} < \varepsilon,$$

so

$$\inf\{\|Lx - m'\|: m' \in M\} \le \varepsilon\|x\|,$$

where $m' = \|x\|m$. For x = 0 the inequality is trivial. ∎
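
The trade-off between $\varepsilon$ and the dimension of M can be made concrete with a diagonal operator. The Python sketch below is an illustration under assumptions: it models $l_2$ by vectors of finite length, takes the hypothetical operator $(Kx)_n = x_n/n$, and uses M spanned by the first k coordinate vectors, so that the distance from Kx to M is the norm of the tail of Kx.

```python
# Illustrative sketch: finite-dimensional approximation of the range of a
# diagonal operator (Kx)_n = x_n / n on finite-length stand-ins for l_2.
import math

def K(x):
    """Apply the diagonal operator with weights 1, 1/2, 1/3, ..."""
    return [v / (n + 1) for n, v in enumerate(x)]

def norm(x):
    """Euclidean norm of a vector."""
    return math.sqrt(sum(abs(v) ** 2 for v in x))

def dist_to_M(y, k):
    """Distance from y to the span of the first k coordinate vectors."""
    return norm(y[k:])

def dimension_for(eps):
    """A dimension k with 1/(k + 1) < eps, enough for this particular K."""
    return math.ceil(1 / eps)

x = [(-1) ** n * 1.0 for n in range(500)]   # hypothetical test vector

for eps in (0.5, 0.1, 0.01):
    k = dimension_for(eps)
    print(eps, k, dist_to_M(K(x), k), eps * norm(x))
```

Each tail entry of Kx is at most 1/(k + 1) times the corresponding entry of x, which is why the dimension k grows like 1/ε for this operator.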


The following theorem presents a number of equivalent formulations for com- 
pactness of an operator. 


5.24.5 THEOREM. Let L: X → Y be a linear operator, where X and Y are
Banach spaces. Then the following statements are equivalent:

(a) L is compact.

(b) If B is any bounded set in X, then L(B) lies in a compact subset of Y.

(c) If B is any bounded set in X, then L(B) lies in a sequentially compact subset
of Y.

(d) If $\{x_n\}$ is any bounded sequence in X, then $\{Lx_n\}$ contains a convergent
subsequence in Y.

(e) If B is any bounded set in X, then L(B) is a totally bounded set in Y.


Proof: Since the equivalence of (b), (c), (d), and (e) follows from the
characterization of compactness in Section 3.17, we shall prove only that (a) ⇔ (b).
It is obvious that (b) ⇒ (a). Let us show that (a) ⇒ (b). Let B be any bounded
set in X. Then there is a real number k > 0 such that

$$\|x - 0\| = \|x\| \le k, \quad\text{for all } x \in B.$$

Let $D = \{x \in X: \|x\| \le 1\}$. Then $B \subset kD$, where

$$kD = \{kx \in X: \|x\| \le 1\} = \{x \in X: \|x\| \le k\}.$$

Since $L(B) \subset L(kD)$ and since L(kD) = kL(D), it follows that L(kD) lies
in a compact set in Y. Hence, L(B) lies in a compact set in Y. ∎


Let us now consider some examples of compact and noncompact operators.

EXAMPLE 1. Let $\phi_1,\dots,\phi_n,\psi_1,\dots,\psi_n$ be elements of $L_2(I)$ and let

$$k(t,s) = \sum_{i=1}^{n} \alpha_i\,\phi_i(t)\,\psi_i(s),$$

where the $\alpha_i$'s are scalars. Define y = Kx by

$$y(t) = \int_I k(t,s)\,x(s)\,ds.$$

Since every point y in $\mathcal{R}(K)$ is given by

$$y(t) = \sum_{i=1}^{n} \beta_i\,\phi_i(t),$$

where $\beta_i = \alpha_i\int_I \psi_i(s)x(s)\,ds$, we see that $\mathcal{R}(K)$ has dimension less than or equal to n.
Hence K is compact. ∎


EXAMPLE 2. Every linear operator defined on a finite-dimensional normed linear space is compact. ∎


EXAMPLE 3. Consider the multiplication operator

\[ F\colon x(t) \to f(t)x(t) \]

on $L_2(I)$, where $f$ is a bounded measurable function. We have seen elsewhere (Example 2, Section 7) that $F$ is a bounded linear operator and $\|F\| \le \|f\|_\infty$. We will now show that $F$ is compact if and only if $f(t) = 0$ almost everywhere, that is, $\|f\|_\infty = \|F\| = 0$.

It is clear that $\|f\|_\infty = 0$ implies that $F$ is the zero operator and, therefore, compact. Going the other way now, assume on the contrary that $\|f\|_\infty \ne 0$.

If $f$ is continuous, then there are positive numbers $\alpha$, $\beta$ such that

\[ |f(t)| \ge \alpha, \quad t \in J, \tag{5.24.1} \]

where $J$ is an interval of length $\beta$.
If $x \in L_2(I)$ and $x(t) = 0$ for $t \notin J$ one has

\[ \|Fx\|_2^2 = \int_I |f(t)|^2 |x(t)|^2\,dt = \int_J |f(t)|^2 |x(t)|^2\,dt \ge \alpha^2 \|x\|_2^2. \]

Now choose an orthonormal sequence $\{x_n\}$ in $L_2(I)$ such that $x_n(t) = 0$ for $t \notin J$. (Why can we do this?) One then has $\|x_n - x_m\| = \sqrt{2}$ for $n \ne m$ and

\[ \|Fx_n - Fx_m\| \ge \sqrt{2}\,\alpha, \quad n \ne m, \]

by the above. Hence $\{Fx_n\}$ cannot contain a convergent subsequence. Therefore, $F$ is not compact.


5.24. COMPACT OPERATORS 383 


If $f$ is not continuous, then for some integer $n$ the set

\[ A_n = \{t : |f(t)| \ge 1/n\} \]

has positive measure. For this $n$, let $\alpha = 1/n$ and let $J$ be a subset of $A_n$ with measure $\beta > 0$. One can then repeat the above argument and show that $F$ is not compact. ∎


EXAMPLE 4. Let $H = l_2$, and let $K$ denote the linear transformation of $H$ into itself defined by

\[ y_n = \alpha_n x_n, \quad n = 1,2,3,\dots, \]

where $y = Kx$, $x = \{x_1,x_2,x_3,\dots\}$, $y = \{y_1,y_2,y_3,\dots\}$, and the $\alpha_n$'s are scalars. We claim that $K$ is compact if and only if the $\alpha_n$'s satisfy the condition

\[ \lim_{n\to\infty} |\alpha_n| = 0. \tag{5.24.2} \]

First assume that $K$ is compact and that $|\alpha_n| \ge \varepsilon > 0$ for all $n$. Then, let

\[ e_m = \{\delta_{1m}, \delta_{2m}, \delta_{3m}, \dots\}, \]

where $\delta_{jm}$ is the Kronecker function. Then

\[ Ke_m = (\alpha_1\delta_{1m}, \alpha_2\delta_{2m}, \dots) = (0,0,\dots,\alpha_m,0,\dots), \]

and for $m \ne n$

\[ \|Ke_m - Ke_n\|^2 = |\alpha_m|^2 + |\alpha_n|^2 \ge 2\varepsilon^2. \]

Hence, $\{Ke_m\}$ does not contain any subsequence that is convergent, and we contradict the fact that $K$ is compact.

If $K$ is compact and (5.24.2) fails, then there is an $\varepsilon > 0$ and a subsequence $\{\alpha_{n_k}\}$ with $|\alpha_{n_k}| \ge \varepsilon$. By using the sequence $\{e_{n_k}\}$ and the above argument we arrive at a similar contradiction. Hence, $K$ compact implies that (5.24.2) holds.

On the other hand, assume that (5.24.2) holds, and then let $A = K(D)$, where $D = \{x : \|x\| \le 1\}$. We shall now show that $A$ has compact closure by applying Exercise 1, Section 3.17. Since $\|Kx\| \le \{\max_n |\alpha_n|\}\|x\|$, we see that $A$ is bounded. If $y \in A$, then

\[ \sum_{n=N}^{\infty} |y_n|^2 = \sum_{n=N}^{\infty} |\alpha_n x_n|^2 \le \max_{N \le n} |\alpha_n|^2 \sum_{n=1}^{\infty} |x_n|^2 \le \max_{N \le n} |\alpha_n|^2 \to 0, \quad \text{as } N \to \infty. \]

Hence, $\sum_{n=N}^{\infty} |y_n|^2 \to 0$ uniformly as $N \to \infty$, so $K$ is compact. ∎
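
The two estimates in Example 4 can be checked numerically on a finite truncation of $l_2$. The following Python sketch is only an illustration: the choice $\alpha_n = 1/n$, the truncation dimension, and all helper names are assumptions made here, not part of the text.

```python
import math

def alpha(n):
    """Illustrative diagonal entries alpha_n = 1/n (an assumption), n = 1, 2, ..."""
    return 1.0 / n

def apply_K(x):
    """Apply y_n = alpha_n * x_n to the finitely many stored coordinates of x."""
    return [alpha(n) * xn for n, xn in enumerate(x, start=1)]

def tail_energy(y, N):
    """Return the sum of |y_n|^2 over the stored coordinates with n >= N."""
    return sum(abs(v) ** 2 for v in y[N - 1:])

def unit_vector(m, dim):
    """e_m = (delta_1m, delta_2m, ...) truncated to dim coordinates."""
    return [1.0 if n == m else 0.0 for n in range(1, dim + 1)]

dim = 200
x = [1.0 / math.sqrt(dim)] * dim      # a vector of norm one
y = apply_K(x)

# Tail of y = Kx compared with the bound max_{n >= N} |alpha_n|^2.
tails = {}
for N in (1, 10, 100):
    bound = max(alpha(n) for n in range(N, dim + 1)) ** 2
    tails[N] = (tail_energy(y, N), bound)
    print(N, tails[N])

# ||K e_m - K e_n||^2 = |alpha_m|^2 + |alpha_n|^2 for m != n.
Ke3 = apply_K(unit_vector(3, dim))
Ke7 = apply_K(unit_vector(7, dim))
dist_sq = sum((p - q) ** 2 for p, q in zip(Ke3, Ke7))
print(dist_sq, alpha(3) ** 2 + alpha(7) ** 2)
```

Because $\alpha_n \to 0$ here, the bound shrinks with $N$; replacing `alpha` by a constant sequence would keep the distances between the `K e_m` bounded away from zero, which is the noncompact case of the example.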




EXAMPLE 5. Let $\{\phi_n\}$ and $\{\psi_n\}$ be orthonormal sets in the Hilbert space $H = L_2(I)$, and consider the linear mapping $K$ of $H$ into itself defined by

\[ Kx = \sum_{n=1}^{\infty} \alpha_n \phi_n(t) \int_I \overline{\psi_n(s)}\,x(s)\,ds = \int_I \left( \sum_{n=1}^{\infty} \alpha_n \phi_n(t)\overline{\psi_n(s)} \right) x(s)\,ds = \int_I k(t,s)x(s)\,ds, \]

where $\overline{\psi_n}$ is the complex conjugate of $\psi_n$ and the $\alpha_n$'s are scalars. We claim that $K$ is compact if and only if

\[ \lim_{n\to\infty} |\alpha_n| = 0. \tag{5.24.3} \]

However, the proof of this assertion is a simple variation of the argument of the preceding example. ∎


The proofs of the next two results are elementary and are outlined in the 
exercises. 


5.24.6 THEOREM. Let $A\colon X \to Y$ and $B\colon X \to Y$ be compact linear operators, where $X$ and $Y$ are Banach spaces. Then $A + B$ is compact.


5.24.7 THEOREM. Let $A\colon X \to X$ be a compact linear operator, where $X$ is a Banach space. Let $B\colon X \to X$ be a bounded linear operator. Then $AB$ and $BA$ are compact.


It is obvious that for every scalar $\alpha$, the operator $\alpha A$ is compact whenever $A$ is compact.

The next theorem shows that the class of all compact operators is a closed subset of $\mathrm{Blt}[X,Y]$.


5.24.8 THEOREM. Let $X$ and $Y$ be Banach spaces and also let $L_n\colon X \to Y$, $n = 1,2,\dots$, be a sequence of compact linear operators converging to a bounded linear operator $L\colon X \to Y$, that is, $\|L_n - L\| \to 0$ as $n \to \infty$. Then $L$ is a compact linear operator. Hence the space of compact operators forms a closed linear subspace of $\mathrm{Blt}[X,Y]$.


Proof: Let $\{L_n\}$ be a sequence of compact linear operators such that $\|L_n - L\| \to 0$ as $n \to \infty$. We know that $L$ is a bounded linear transformation. The issue is to show that it is compact. We shall do this by showing that $L(D)$ is totally bounded in $Y$ [see Theorem 5.24.5(e)], where $D$ is the closed unit ball in $X$. Let $\varepsilon > 0$ be given and choose $N$ so that $\|L - L_N\| < \varepsilon$. This means that

\[ \|Lx - L_N x\| \le \|L - L_N\|\,\|x\| \le \varepsilon \|x\| \]

for all $x \in X$. Since $L_N$ is compact, the set $L_N(D)$ is totally bounded, so it contains an $\varepsilon$-net $\{y_1,y_2,\dots,y_n\}$. It is easy to check that the set $\{y_1,y_2,\dots,y_n\}$ forms a $2\varepsilon$-net for $L(D)$. ∎


The next example is very important and should be studied carefully. 


EXAMPLE 6. Consider the space $L_2(I)$ where $I$ is the finite interval $[a,b]$. Let $y = Kx$ be the integral operator given by

\[ y(t) = \int_I k(t,s)x(s)\,ds, \]

where the kernel $k(t,s)$ satisfies

\[ \int_I \int_I |k(t,s)|^2\,dt\,ds < \infty. \tag{5.24.4} \]

In other words, $k$ is an element of $L_2(I \times I)$. [Recall that (5.24.4) is satisfied if $k(t,s)$ is continuous and $I$ is compact.] It was shown in Examples 2 and 4 of Section 18 that

\[ \phi_n(t) = (b - a)^{-1/2} \exp\left( 2\pi i n\,\frac{t - a}{b - a} \right), \quad n = 0, \pm 1, \pm 2, \dots, \]

forms an orthonormal basis for $L_2(I)$ and that

\[ \psi_{n,m}(t,s) = \phi_n(t)\phi_m(s) \]

forms an orthonormal basis for $L_2(I \times I)$. The Fourier Series Theorem for $L_2(I \times I)$ then tells us that

\[ \|k_N - k\| \to 0, \quad \text{as } N \to \infty, \]

where

\[ k_N = \sum_{|m|,|n| \le N} (k, \psi_{n,m})\,\psi_{n,m}. \]

If we let $K_N$ denote the integral operator with kernel $k_N$, then we have

\[ \|K - K_N\| \le \|k - k_N\| \to 0, \quad \text{as } N \to \infty. \]

Since the dimension of $\mathcal{R}(K_N)$ is $(2N + 1)^2$, it follows that $K_N$ is compact and, as a consequence of Theorem 5.24.8, we see that $K$ is compact.
Integral operators of this form arise naturally in many applications. ∎


EXAMPLE 7. The conclusion of the last example is valid even when the interval $I$ is nonfinite. The argument is exactly the same. The only difference is in terms of the representation for the orthonormal basis for $L_2(I \times I)$. Thus, if $I = [0,\infty)$ we could take

\[ \phi_n(t) = \frac{1}{n!}\,e^{-t/2} L_n(t), \]

where $L_n(t)$ is the Laguerre polynomial (see Exercise 10, Section 18) and

\[ \psi_{n,m}(t,s) = \phi_n(t)\phi_m(s). \]

This is discussed further in the exercises. ∎




EXERCISES 


1. In Theorem 5.24.5, we showed that (e) $\Rightarrow$ (a). Show that this implication may fail if $Y$ is not complete.

2. In Example 1 it was shown that $\dim \mathcal{R}(K) \le n$. Is it possible to have $\dim \mathcal{R}(K) < n$? In Example 6 we state that $\dim \mathcal{R}(K_N) = (2N + 1)^2$. How do these examples differ?


3. Let $k(s,t)$ be a measurable function on $I \times I$ such that

\[ \int_I |k(s,t)|\,dt \le M, \]

for all $s$ in $I$. Define the integral transformation $K$ by $y = Kx$, where

\[ y(s) = \int_I k(s,t)x(t)\,dt. \]

Show that $K$ is a bounded linear operator on $L_\infty(I)$. Now assume that $I$ is a compact set and $\int_I 1\,dt < \infty$, and that the mapping $s \to k(s,\cdot)$ is a continuous mapping of $I$ into $L_1(I)$. Then show that $K$ is compact. [Hint: For the last part, show that $\mathcal{R}(K) \subset C$, the space of continuous functions, and then apply the Arzelà-Ascoli Theorem.]


4. Let $X$ be a Hilbert space with a countable orthonormal basis $\{e_1,e_2,\dots\}$, that is, $(e_n,e_m) = \delta_{n,m}$. Let $L\colon X \to X$ be a compact linear operator and define $L_n\colon X \to X$ by

\[ L_n x = \sum_{i,j=1}^{n} (x,e_i)(Le_i,e_j)\,e_j. \]

Show that $L_n$ is compact and that $\|L_n - L\| \to 0$ as $n \to \infty$. (This shows that every compact linear operator on a separable Hilbert space is the limit of operators with finite-dimensional range. This is an unresolved question for compact operators on a Banach space.) How does this result compare with Theorem 5.24.4?


5. Prove Theorem 5.24.6.

6. Prove Theorem 5.24.7. [Hint: Recall that a continuous function preserves convergent sequences and that a bounded function preserves bounded sequences.]


7. Consider the Hilbert space $AP$ defined in Exercise 10, Section 17, with inner product

\[ (x,y) = \lim_{T\to\infty} \frac{1}{2T} \int_{-T}^{T} x(t)\overline{y(t)}\,dt. \]

Let $x_0 \in AP$ and define $y = Lx$ by

\[ y(t) = \lim_{T\to\infty} \frac{1}{2T} \int_{-T}^{T} x_0(t - \tau)x(\tau)\,d\tau. \]

(a) Show that $L$ is a bounded linear mapping of $AP$ into $AP$.
(b) Show that if $x_0(t) = e^{iat}$ for some real number $a$, then $L$ is self-adjoint and compact. (Compactness is not easily verified.)


8. Let $y = Ax$ be an operator on $l_2[0,\infty)$ defined by

\[ y_n = \sum_{m=0}^{\infty} a_{nm} x_m, \]

where $\sum_{n,m=0}^{\infty} |a_{nm}|^2 < \infty$. Show that $A\colon l_2 \to l_2$ and that $A$ is compact.


9. Let $y = Ax$ be an operator on $l_2(-\infty,\infty)$ defined by


Assume that the sequence $\{\alpha_m\}$ is chosen so that $A$ is a bounded linear operator with range in $l_2(-\infty,\infty)$. Assume also that $\alpha_m$ is real and nonnegative for all $m$.

(a) When is the range of $A$ finite dimensional?

(b) When is the operator $A$ compact?


10. Let $L\colon X \to X$ be an isometric isomorphism, where $X$ is a normed linear space.

(a) Show that $L$ is compact if and only if $X$ is finite dimensional.
(b) Is the same conclusion valid if $L$ is only a topological isomorphism?


11. Let $L\colon X \to X$ be a bounded linear operator that satisfies $\|Lx\| \ge \alpha\|x\|$ for all $x$, where $\alpha > 0$. Show that $L$ is compact if and only if $X$ is finite dimensional.


12. Let $L\colon X \to Y$ be a compact operator and let $M$ be a linear subspace of $X$. Show that the restriction of $L$ to $M$, that is, $L\colon M \to Y$, is compact.


13. (Hilbert-Schmidt operators.) Let $\{x_n\}$ be an orthonormal basis for a Hilbert space $H$. A bounded linear operator $T\colon H \to H$ is said to be a Hilbert-Schmidt operator if $\sum_n \|Tx_n\|^2 < \infty$. The number

\[ |||T||| = \left( \sum_n \|Tx_n\|^2 \right)^{1/2} \]

is called the Hilbert-Schmidt norm of $T$.

(a) Show that $|||T|||$ does not depend on the choice of basis.

(b) Show that $\|T\| \le |||T|||$, where $\|T\|$ denotes the usual norm of $T$.

(c) Show that the integral operators in Example 6 are Hilbert-Schmidt operators. (For a converse statement, see Dunford and Schwartz [1, Part 2, p. 1083].)

(d) Show that every Hilbert-Schmidt operator is compact and is the limit (in the Hilbert-Schmidt norm) of a sequence of operators with finite-dimensional range.

(e) See Dunford and Schwartz [1, Part 2, p. 1020] for a representation of the Hilbert-Schmidt norm in the case of a one-to-one matrix operator.


14. Where would the unitary operators fit in Figure 5.24.1? (Distinguish between the finite-dimensional and the infinite-dimensional cases.)




15. Under what conditions will the operators K discussed in Examples 4 and 5 be 
Hilbert-Schmidt operators ? 


16. Show that an orthogonal projection P on a Hilbert space H is compact if and 
only if the range of P is finite dimensional. 


17. Show that if $T$ is a continuous time-invariant mapping of $L_2(-\infty,\infty)$ into itself, then $T$ is compact if and only if $T = 0$. (Assume here that $T = \mathcal{F}^{-1}T(i\omega)\mathcal{F}$, where $\mathcal{F}$ is the Fourier transform.)


18. Verify the assertion that the operator K in Example 5 is compact if and only if 
(5.24.3) holds. 


25. FOUNDATIONS OF QUANTUM MECHANICS 


One of the major triumphs of the theory of Hilbert spaces is that it affords a framework for discussing and analyzing the mathematical theory of quantum mechanics. In the 1920s W. Heisenberg and E. Schrödinger developed seemingly different theories to explain the quantum effect of atomic physics. The Heisenberg theory is based on (infinite-dimensional) matrix methods whereas the Schrödinger theory is based on the properties of the differential operators appearing in the wave equation. It soon became apparent, however, that the two theories were actually equivalent²² and that this equivalence is a consequence of the theory of linear operators on a Hilbert space.

We do not propose to discuss these two theories here. They are adequately treated in a myriad of books which have appeared in the last half century. Instead we would like to present a concise mathematical description²³ of the foundations of quantum mechanics.

Let us begin with the concept of the energy of a physical system. This is an example of an "observable" in physics. In classical mechanics, the energy of a system is a real-valued function of the phase coordinates of the system. In quantum mechanics the energy of a system is identified with an appropriate self-adjoint operator.²⁴ In general, the observables (such as position, velocity, momentum, and so on) of physics are identified with appropriate self-adjoint operators on a Hilbert space, specifically an infinite-dimensional separable Hilbert space.

There are at least two ways to explain this identification. The first explanation, and undoubtedly the simplest, is to observe that the theory of self-adjoint operators adequately describes the pertinent physical phenomena. This then reduces the identification question to one of mathematical modeling.

The second explanation occurs when one examines the "propositional calculus of quantum mechanics." Picture the collection of all yes-no experiments, or propositions, which are used to describe a physical system. For example, one might ask whether the energy of the system is positive. In classical mechanics this collection of propositions can be identified with the collection of all²⁵ subsets of the phase space. In this case, an answer would be "yes" if the phase coordinates of the system were within a prescribed subset.


²² In fact Schrödinger himself was one of the first people to make this observation.

²³ The description we present here is by no means the only possible point of view. Moreover, while it is possible to base this description on a physically reasonable and mathematically rigorous foundation, we simply do not have the space to do that here. For more details we refer the reader to the excellent book by J. M. Jauch [2].

²⁴ For this example the appropriate self-adjoint operator happens to be unbounded. Therefore we shall postpone a more detailed discussion of the energy operator until Section 7.12.

One of the fundamental facts in the quantum theory is that the outcome of two yes-no experiments $A$ and $B$ may depend on the order in which $A$ and $B$ are measured. This is not true in classical mechanics. Therefore one cannot expect to identify the yes-no experiments of quantum mechanics in the same manner. However, one can identify these experiments with the orthogonal projections on a Hilbert space. Let us denote this identification by $a \leftrightarrow P_a$, where $a$ represents a yes-no experiment and $P_a$ is an orthogonal projection with range $\mathcal{R}(P_a)$ and null space $\mathcal{N}(P_a)$. This identification is further explained in Table 1.


TABLE 1
IDENTIFICATION OF YES-NO EXPERIMENTS WITH ORTHOGONAL PROJECTIONS

PROPOSITIONAL   HEURISTIC          HILBERT SPACE
NOTATION        MEANING            NOTATION

$a$             proposition "a"    $P_a$ = associated orthogonal projection
$b$             proposition "b"    $P_b$ = associated orthogonal projection
$a \subset b$   a implies b        $P_a \le P_b$, that is, $P_b$ is an extension of $P_a$
$a'$            not a              $P_{a'} = I - P_a$
$a \cap b$      a and b            $P_a \cap P_b$ is the orthogonal projection onto $\mathcal{R}(P_a) \cap \mathcal{R}(P_b)$
$a \cup b$      a or b             $P_a \cup P_b$ is the orthogonal projection onto $[\mathcal{N}(P_a) \cap \mathcal{N}(P_b)]^{\perp}$


The next concept is that of the "state" of a physical system. In classical deterministic mechanics the state is a point in the phase space, or a delta function. In classical statistical mechanics the state is a probability function defined on the subsets of the phase space, or equivalently on the collection of yes-no experiments. Therefore, in quantum mechanics a state is defined as a probability function defined on the collection of yes-no experiments.

By using the fact that the yes-no experiments have been identified with the orthogonal projections on a Hilbert space $H$, it is possible to find a representation for the states of a quantum-mechanical system, but first we need a few definitions.

If $L$ is a self-adjoint operator on a Hilbert space $H$, then the trace of $L$ is given by

\[ \operatorname{tr} L = \sum_n (Lx_n, x_n), \tag{5.25.1} \]

where $\{x_n\}$ is an orthonormal basis for $H$. The sum in (5.25.1) may be unbounded, in which case the trace is said to be $+\infty$. In the exercises the reader is asked to show that this definition of $\operatorname{tr} L$ does not depend on the choice of basis $\{x_n\}$.


²⁵ Strictly speaking, all "Borel" subsets.




Now let $W\colon H \to H$ be a linear operator. We shall say that $W$ is a density operator if $W$ is self-adjoint, $\operatorname{tr} W = 1$, and

\[ 0 \le (W^2x,x) \le (Wx,x), \quad x \in H. \tag{5.25.2} \]

Inequality (5.25.2) is sometimes noted as $0 \le W^2 \le W$, by Definition 5.23.7.

It was shown by Gleason [1] that the states of a quantum-mechanical system can be put into one-to-one correspondence with the density operators. Moreover, if a state $p$ is identified with a density operator $W$ then the value of $p$ on a yes-no experiment $a$ is given by

\[ p(a) = \operatorname{tr} WP_a \quad (= p(P_a)), \]

where $P_a$ is the orthogonal projection associated with $a$.

A state $p$ is said to be a pure state if the associated density operator $W$ satisfies $W = W^2$.

The proof of the following theorem is an easy exercise. 


5.25.1 THEOREM. Let $p$ be a state and let $W$ be the associated density operator. Then the following statements are equivalent:

(a) $p$ is a pure state, that is, $W^2 = W$.

(b) W is an orthogonal projection with a one-dimensional range. 

(c) There is a unit vector e in H such that p(P) = (Pe,e) for every orthogonal 
projection P. 

(d) There is an orthogonal projection P that satisfies p(P) = 1. 


As a consequence of this result we see that every pure state $p$ can be identified with a unit vector $e$ in $H$ so that the formula $p(P) = (Pe,e)$ is valid for every orthogonal projection $P$.

As already noted the observables in quantum mechanics are identified with self-adjoint operators. It should be mentioned that the most interesting observables are identified with unbounded self-adjoint operators so we have to postpone further discussion of this until Section 7.12. These identifications are summarized in Table 2.


TABLE 2
IDENTIFICATION OF QUANTUM-MECHANICAL OBJECTS WITH HILBERT SPACE OBJECTS

PHYSICAL OBJECT       CORRESPONDING HILBERT SPACE OBJECT

Yes-no experiment     Orthogonal projection
State                 Density operator
Pure state            Unit vector
Observable            Self-adjoint operator




A final concept we shall note here is that of "expected value." Let $A$ be a self-adjoint operator (observable) and let $W$ be a density operator (state). Then the expected value of the observable $A$ with respect to the state $W$ is given by

\[ E(A) = \operatorname{tr} WA. \tag{5.25.3} \]

If $W$ represents a pure state and $e$ is a unit vector with $We = e$, then one has

\[ E(A) = (Ae,e). \tag{5.25.4} \]

This notion of expected value is related to the probabilistic concept of mathematical expectations. We must, however, postpone a more detailed discussion of it until we examine the spectral theorem. (See Section 7.12.)
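
For a finite-dimensional illustration of (5.25.3) and (5.25.4), the following Python sketch works in $C^2$. The particular observable `A`, the unit vector `e`, the mixed state `W_mixed`, and the helper names are all assumptions chosen for the example; the inner product is taken linear in the first argument.

```python
import math

def matmul(A, B):
    """Product of two square matrices given as nested lists."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def matvec(A, x):
    """Apply the matrix A to the vector x."""
    return [sum(a * xi for a, xi in zip(row, x)) for row in A]

def trace(A):
    """Sum of the diagonal entries, i.e. sum_n (A x_n, x_n) in the standard basis."""
    return sum(A[i][i] for i in range(len(A)))

def inner(x, y):
    """Inner product (x, y), conjugate-linear in the second argument."""
    return sum(a * b.conjugate() for a, b in zip(x, y))

# A self-adjoint observable on C^2 (hypothetical example).
A = [[1.0, 1j], [-1j, 2.0]]

# Pure state: W = projection onto the span of the unit vector e, so We = e.
a = 1.0 / math.sqrt(2.0)
e = [a, 1j * a]
W_pure = [[ei * ej.conjugate() for ej in e] for ei in e]

E_trace = trace(matmul(W_pure, A))   # E(A) = tr WA, Equation (5.25.3)
E_inner = inner(matvec(A, e), e)     # E(A) = (Ae, e), Equation (5.25.4)
print(E_trace, E_inner)

# A mixed state: W = I/2 satisfies tr W = 1 and 0 <= W^2 <= W, but W^2 != W.
W_mixed = [[0.5, 0.0], [0.0, 0.5]]
E_mixed = trace(matmul(W_mixed, A))
print(E_mixed)
```

The two expressions for the pure state agree because $W$ is the rank-one projection onto the span of $e$; for the mixed state only the trace formula is available.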


Dynamical Equations 


We shall use the Schrödinger version of the equations of motion for a quantum-mechanical system. A discussion of other versions can be found in Jauch [2, Chap. 10].

The dynamics of the system are described in terms of a one-parameter group of unitary operators

\[ U_t = e^{-iAt}, \]

where $A$ is self-adjoint. We note that $U_t$ then satisfies

\[ U_t U_s = U_{t+s} \quad \text{and} \quad U_t^{*} = U_{-t}. \]

If $p$ is a pure state and is represented by a unit vector $\phi$, in the sense of Theorem 5.25.1, then the time evolution of $\phi$ is given by

\[ \phi_t = U_t\phi = e^{-iAt}\phi. \]

One then has

\[ \frac{d\phi_t}{dt} = \lim_{h\to 0} \frac{1}{h}(\phi_{t+h} - \phi_t) = \lim_{h\to 0} \frac{1}{h}(e^{-iAh} - I)e^{-iAt}\phi = -iA\phi_t, \]

where the limiting behavior above is discussed in Exercise 7 below. In other words, the equations of motion become

\[ \frac{d}{dt}\phi_t = -iA\phi_t, \tag{5.25.5} \]

where $A$ is a self-adjoint operator. We shall return to this in Section 7.12.
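
The group property, the preservation of norm, and Equation (5.25.5) can be observed numerically in a very simple case. The following Python sketch assumes a diagonal self-adjoint operator $A = \operatorname{diag}(1, 2)$ on $C^2$, so that $e^{-iAt}$ acts coordinatewise; the diagonal entries, the initial vector, and the step size are illustrative choices, not taken from the text.

```python
import cmath

a = [1.0, 2.0]        # diagonal entries of the assumed operator A
phi = [0.6, 0.8j]     # unit vector: 0.36 + 0.64 = 1

def U(t, v):
    """Apply U_t = exp(-iAt) for the diagonal A above."""
    return [cmath.exp(-1j * ak * t) * vk for ak, vk in zip(a, v)]

def norm(v):
    """Euclidean norm on C^2."""
    return sum(abs(vk) ** 2 for vk in v) ** 0.5

t, s, h = 0.7, 0.4, 1e-6
phi_t = U(t, phi)

# Group property U_t U_s = U_{t+s}, tested on phi.
group_gap = max(abs(p - q) for p, q in zip(U(t, U(s, phi)), U(t + s, phi)))

# Difference quotient (phi_{t+h} - phi_t)/h compared with -iA phi_t.
diff_quot = [(p - q) / h for p, q in zip(U(t + h, phi), phi_t)]
rhs = [-1j * ak * vk for ak, vk in zip(a, phi_t)]
deriv_gap = max(abs(p - q) for p, q in zip(diff_quot, rhs))

print(norm(phi_t), group_gap, deriv_gap)
```

The difference quotient only approximates the derivative, so `deriv_gap` is small but not zero; it shrinks roughly in proportion to `h` until rounding error takes over.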


EXERCISES 


1. Show that the formula for trace does not depend on the basis.

2. Show that the expected value $E$, with respect to a pure state $p$, satisfies

\[ E(A + B) = E(A) + E(B), \qquad E(\lambda A) = \lambda E(A). \]


3. Let $\{x_n\}$ be an orthonormal set and let $\{\lambda_n\}$ be a sequence of real numbers that satisfy $\lambda_n \ge 0$ and $\sum_n \lambda_n = 1$. Define $p(P)$ by

\[ p(P) = \sum_n \lambda_n (x_n, Px_n), \]

where $P$ is an arbitrary orthogonal projection.
(a) Show that $p$ is a state by determining the associated density operator.
(b) Show that $p = \sum_n \lambda_n p_n$, where $\{p_n\}$ is a sequence of pure states.


4. A state $p$ is said to be a mixture if there are two distinct states $p_1$ and $p_2$ and positive numbers $\lambda_1$, $\lambda_2$ such that $\lambda_1 + \lambda_2 = 1$ and $p = \lambda_1 p_1 + \lambda_2 p_2$. Show that every state is either a mixture or a pure state.


5. Let $p$ be a state and define the dispersion function $\sigma(a) = p(a) - p^2(a)$ for every proposition $a$. Let $\sigma = \sup\{\sigma(a) : a \text{ is a proposition}\}$ be the overall dispersion. A state is dispersion-free if $\sigma = 0$. Show that every dispersion-free state is pure.


6. Let $A = (a_{ij})$ be a self-adjoint matrix operator on the finite-dimensional Hilbert space $C^n$, where $C^n$ has the usual inner product. Show that

\[ \operatorname{tr} A = \sum_{i=1}^{n} a_{ii}. \]

7. Let $A$ be a bounded self-adjoint operator on a Hilbert space $H$. Show that the limit

\[ \lim_{h\to 0} \frac{1}{h}(e^{-iAh} - I) = -iA \]

exists in the norm topology on the space of bounded linear operators. [This proves Equation (5.25.5) for bounded operators $A$.]


8. Let $P$ be an orthogonal projection in a Hilbert space $H$ with $\dim \mathcal{R}(P) = k$. Show that $\operatorname{tr} P = k$.


9. Let $A$ be a bounded linear operator on a Hilbert space $H$. Show that

\[ \operatorname{tr} A^{*}A = \sum_n \|Ax_n\|^2, \]

where $\{x_n\}$ is an orthonormal basis in $H$.

10. Let $A$ be a self-adjoint operator with $0 < \operatorname{tr} A < \infty$. Show that $\|A\| \le \operatorname{tr} A$.


11. Let $K$ be a self-adjoint integral operator on $L_2(I)$ given by

\[ y(t) = \int_I k(t,s)x(s)\,ds, \]

where $\int_I \int_I |k(t,s)|^2\,dt\,ds < \infty$. Show that $\operatorname{tr} K = \int_I k(t,t)\,dt$.


12. Let $A$ be a bounded self-adjoint operator on $l_2$. Assume that there is a unitary operator $U$ such that $UAU^{-1} = \Lambda$ where $\Lambda = \operatorname{diag}(\lambda_1,\lambda_2,\dots)$ is a diagonal matrix. Show that $\operatorname{tr} A = \sum_n \lambda_n$.




SUGGESTED REFERENCES 


Akhiezer and Glazman [1] Kolmogorov and Fomin [1] 
Banach [1] Krasnosel’skii and Rutickii [1] 
Day [1] Naimark [1] 

Dunford and Schwartz [1] von Neumann [1]

Edwards [1] Porter [1] 

Goffman and Pedrick [1] Simmons [1] 

Halmos [3] Taylor [2] 

Hewitt and Stromberg [1] Wilansky [1]


Indritz [1] Zaanen [1] 


Analysis of Linear Operators (Compact Case)


1. Introduction 
Part A An Illustrative Example 
2. Geometric Analysis of Operators 


3. Geometric Analysis. The 
Eigenvalue-Eigenvector Problem 


4. A Finite-Dimensional Problem 


Part B The Spectrum


5. The Spectrum of a Linear 
Transformation 


6. Examples of Spectra 
7. Properties of the Spectrum 


Part C Spectral Analysis 
8. Resolutions of the Identity 
9. Weighted Sums of Projections 


10. Spectral Properties of Compact, 
Normal, and Self-Adjoint Operators 


11. The Spectral Theorem 


12. Functions of Operators 
(Operational Calculus) 


13. Applications of the Spectral 
Theorem 


14. Nonnormal Operators 




1. INTRODUCTION 


In this chapter we will be concerned with the (spectral) analysis of continuous 
linear operators on a complex Hilbert space. More precisely, we will be primarily 
concerned with a special class of continuous linear operators, namely, compact 
operators. 

The first part of this chapter is devoted to an illustrative example where we describe this analysis. The reader will see that this analysis, the spectral analysis, is basically a geometric study of the behavior of linear operators. Our purpose in discussing this example early in the chapter is twofold:


(1) to show the genesis of the eigenvalue problem for linear operators, and 
(2) to give a spectral analysis of certain finite-dimensional operators. 


Both of these aspects will play a central role in our study of operators on 
infinite-dimensional spaces. 

The remainder of the chapter is devoted to finding an appropriate generalization of the finite-dimensional methods to (Hilbert) spaces of arbitrary dimension. In order to do this it will be necessary to find an appropriate generalization of the concept of an eigenvalue. This leads to the notion of the spectrum of an operator, which is defined in Section 5.

There are several reasons for concentrating on compact operators at this point. As might be expected, one of the big divisions in spectral theory is the distinction between the finite- and the infinite-dimensional cases. However, within the infinite-dimensional case itself there are also different levels of complexity. Unbounded linear operators call for a spectral theory of great subtlety. The spectral theory for continuous linear operators is less subtle but certainly not child's play. However, the spectral theory for compact operators on infinite-dimensional spaces is relatively simple. It is a more or less enriched version of the finite-dimensional theory, and it offers the beginner very few unpleasant shocks.

One of the reasons, then, that we start with compact linear operators is that they are easy. Another reason is that understanding of the spectral theory for compact operators is a wonderful stepping stone to the understanding of the spectral theory for general bounded and unbounded linear operators. Finally, and most important of all, a great number of linear operators that occur in practice are compact operators defined on Hilbert spaces.

Actually the most powerful results we obtain will not be applicable to compact operators in general but to compact normal operators. Although many useful compact operators are also normal, it is also true that many are not. This is just a fact of life that has to be lived with. At the end of the chapter we will show one method of treating compact operators that are not normal. In any event, by the end of this chapter the reader should begin to understand why one usually prefers, when given a choice, to work with normal operators.




Part A 


An Illustrative 
Example 


2. GEOMETRIC ANALYSIS OF OPERATORS 


Let $L$ be a linear operator defined on a complex Hilbert space $H$. The idea of a geometric analysis of $L$ is to break up $H$ into a number of parts (perhaps infinitely many) in such a way that the operation of $L$ on each part is particularly simple. A simple example of this was presented in the last chapter when we discussed orthogonal projections. That is, if $P\colon H \to H$ is an orthogonal projection on a Hilbert space $H$, then


(i) $P$ is the identity operator on $\mathcal{R}(P)$ and
(ii) $P$ is the zero operator on $\mathcal{N}(P)$.


Knowing (i) and (ii) and knowing that $H = \mathcal{R}(P) + \mathcal{N}(P)$, we know what $P$ does to an arbitrary element of $H$.

The decomposition of the operator $P$ into the two parts given in (i) and (ii) illustrates in a minuscule manner how we propose to analyze an arbitrary linear operator $L$.

Let us illustrate this further with a slightly more complicated operator, which 
we define in Equation (6.2.2). But first we need the following concept. 


6.2.1 DEFINITION.¹ We shall say that a family of continuous projections $\{P_1,\dots,P_m\}$ is a resolution of the identity if (i) the projections are orthogonal, (ii) $P_iP_j = 0$ if $i \ne j$, and (iii) $I = P_1 + \cdots + P_m$.


6.2.2 LEMMA. Let $\{P_1,\dots,P_m\}$ be a resolution of the identity in a Hilbert space $H$. Then

\[ H = \mathcal{R}(P_1) + \cdots + \mathcal{R}(P_m). \tag{6.2.1} \]

The proof uses mathematical induction on $m$. For $m = 1$, it is obvious, and for $m = 2$ it follows from the Projection Theorem (Theorem 5.16.4). We leave the details as an exercise.

Let $\{P_1,\dots,P_m\}$ be a resolution of the identity on $H$ and let $\{\lambda_1,\dots,\lambda_m\}$ be a family of complex numbers. Assume that $\lambda_i \ne \lambda_j$ for $i \ne j$, and let

\[ L = \lambda_1 P_1 + \cdots + \lambda_m P_m. \tag{6.2.2} \]

Without any loss of generality we can assume that $P_i \ne 0$ for $1 \le i \le m$.


¹ Later we shall give another definition which will include this one as a special case. (See Section 8.)




398 ANALYSIS OF LINEAR OPERATORS (COMPACT CASE) 


Before giving a geometric analysis of the operator L, we note that it is always 
normal. 


6.2.3 LEMMA. Let $\{P_1,\dots,P_m\}$ be a resolution of the identity. Then the linear operator $L$ given by (6.2.2) is continuous and normal. Moreover, $L$ is self-adjoint if and only if all the $\lambda_i$'s are real.


The following theorem gives a geometric analysis of L. 


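6.2.4 THEOREM. Let $\{P_1,\dots,P_m\}$ be a resolution of the identity, where $P_i \ne 0$ for $1 \le i \le m$. Then the space $H$ can be decomposed as

\[ H = \mathcal{R}(P_1) + \cdots + \mathcal{R}(P_m), \]

where $\mathcal{R}(P_i) \perp \mathcal{R}(P_j)$ for $i \ne j$, and the operator $L = \lambda_1 P_1 + \cdots + \lambda_m P_m$ agrees with $\lambda_i I$ on $\mathcal{R}(P_i)$, and any vector $x \in H$ can be expressed (uniquely) as $x = x_1 + \cdots + x_m$, where $x_i \in \mathcal{R}(P_i)$ and

\[ Lx = \lambda_1 x_1 + \cdots + \lambda_m x_m. \tag{6.2.3} \]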


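Proof: Lemma 6.2.2 assures us that the space $H$ can be decomposed as indicated. Since $P_iP_j = 0$ for $i \ne j$, it follows that $\mathcal{R}(P_i) \subset \mathcal{N}(P_j) = \mathcal{R}(P_j)^{\perp}$ for $i \ne j$. Therefore, if $x \in \mathcal{R}(P_i)$, then $x \in \mathcal{N}(P_j)$ for $i \ne j$ and

\[ Lx = (\lambda_1 P_1 + \cdots + \lambda_m P_m)(x) = \lambda_i P_i x = \lambda_i I x. \]

Hence $L$ agrees with $\lambda_i I$ on $\mathcal{R}(P_i)$. Equation (6.2.3) follows from the linearity of $L$. ∎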
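
A small finite-dimensional instance may make Theorem 6.2.4 concrete. The following Python sketch builds a resolution of the identity on $R^3$ and the operator $L = \lambda_1 P_1 + \lambda_2 P_2 + \lambda_3 P_3$; the three projections, the scalars $2, 5, -1$, and the test vector are assumptions chosen so that all arithmetic is exact in floating point.

```python
def matmul(A, B):
    """Product of two square matrices given as nested lists."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def matvec(A, x):
    """Apply the matrix A to the vector x."""
    return [sum(a * xi for a, xi in zip(row, x)) for row in A]

def combine(coeffs, mats):
    """Return the weighted sum c_1 M_1 + ... + c_m M_m."""
    n = len(mats[0])
    return [[sum(c * M[i][j] for c, M in zip(coeffs, mats)) for j in range(n)]
            for i in range(n)]

# Orthogonal projections onto span{(1,1,0)}, span{(1,-1,0)}, span{(0,0,1)}.
P1 = [[0.5, 0.5, 0.0], [0.5, 0.5, 0.0], [0.0, 0.0, 0.0]]
P2 = [[0.5, -0.5, 0.0], [-0.5, 0.5, 0.0], [0.0, 0.0, 0.0]]
P3 = [[0.0, 0.0, 0.0], [0.0, 0.0, 0.0], [0.0, 0.0, 1.0]]
projections = [P1, P2, P3]
lambdas = [2.0, 5.0, -1.0]

identity = combine([1.0, 1.0, 1.0], projections)   # condition (iii): I = P1 + P2 + P3
L = combine(lambdas, projections)                   # Equation (6.2.2)

# Decompose x = x_1 + x_2 + x_3 with x_i = P_i x, then compare with (6.2.3).
x = [3.0, 1.0, 4.0]
parts = [matvec(P, x) for P in projections]
Lx_from_parts = [sum(lam * part[i] for lam, part in zip(lambdas, parts))
                 for i in range(3)]
print(matvec(L, x), Lx_from_parts)

# L agrees with lambda_1 I on R(P1): L(1,1,0) = 2 * (1,1,0).
print(matvec(L, [1.0, 1.0, 0.0]))
```

Since the scalars are real, this $L$ is symmetric, in line with Lemma 6.2.3.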


EXERCISES 
1. Let {P,,...,P,,} be a resolution of the identity on a Hilbert space H. Show that 


RP) = (lv) 
jz 
and 
RP;) = (\ w(P)). 
j#Fi 
2. Let {Q,,...,0,,$ be a family of nonzero orthogonal projections on a Hilbort 
space H with the property that 
H = V(R(Q4) V+ VU RO,,)). 
(a) Show that there is a resolution of the identity {P,,...,P,} such that P, (, 


and &(O;) < A(P;), i= 1,2,...,n<m. 
(b) Is it possible to have m =n? 


3. Prove Lemma 6.2.2. 
4. Prove Lemma 6.2.3. 


6.3. GEOMETRIC ANALYSIS. THE EIGENVALUE-EIGENVECTOR PROBLEM 399 


5. Let $\{Q_1,\dots,Q_m\}$ be a family of orthogonal projections on a Hilbert space $H$ and set

\[ L = \lambda_1 Q_1 + \cdots + \lambda_m Q_m. \]

(a) Show that $L$ is linear and continuous.
(b) Is it possible for $L$ to be nonnormal?


6. Let $\{P_1,\dots,P_m\}$ be a resolution of the identity where $m \ge 2$. Show that $\{Q_2,\dots,Q_m\}$ is a resolution of the identity where $Q_2 = P_1 + P_2$ and $Q_i = P_i$, $3 \le i \le m$.


7. Let $\{e^{i2\pi nt} : n = 0, \pm 1, \dots\}$ be an orthonormal basis for $L_2[0,1]$. Let $P_n$ be the orthogonal projection onto $V(e^{i2\pi nt})$ and $Q_N$ the orthogonal projection onto $V(\{e^{i2\pi mt} : N < |m|\})$. Show that $\{P_n : |n| \le N\} \cup \{Q_N\}$ is a resolution of the identity. (For example, the operator $L = \sum_{n=-N}^{N} P_n$ represents a low-pass filter with gain $= 1$ and phase shift $= 0$ in the passband.)


8. Let $\{Q_1,\dots,Q_m\}$ be a resolution of the identity on a Hilbert space $H$. Let $\{A_1,\dots,A_n\}$ be a partition of the set of integers $\{1,2,\dots,m\}$. Assume that $A_i \ne \emptyset$ for $i = 1,2,\dots,n$. Define $P_i$ by

\[ P_i = \sum_{j \in A_i} Q_j. \]

Show that $\{P_1,\dots,P_n\}$ is a resolution of the identity. [Hint: Use mathematical induction.]


3. GEOMETRIC ANALYSIS. THE EIGENVALUE-EIGENVECTOR PROBLEM 


Theorem 6.2.4 does give a geometric interpretation of an operator L that can 
be expressed as a finite linear combination of orthogonal projections. Unfortunately, 
linear operators are generally not given in this convenient form. Therefore, two 
questions arise. 


(i) Under what conditions can a linear operator L be represented as a finite 
linear combination of orthogonal projections? 

(ii) If one can do this, how can one find the scalars $\lambda_i$ and the corresponding projections $P_i$?


The first question will require a little work, and we shall give a partial answer in 
the next section. The answer to the second question is, however, rather easy. 


6.3.1 THEOREM. Let $L$ be a linear operator on a Hilbert space $H$ and assume that there is a resolution of the identity $\{P_1,\dots,P_m\}$, with $P_i \ne 0$ for $i = 1,\dots,m$, and that there is an $m$-tuple of distinct scalars $\{\lambda_1,\dots,\lambda_m\}$ such that

\[ L = \lambda_1 P_1 + \cdots + \lambda_m P_m. \]

Then the only scalars $\lambda$ for which the equation

\[ \lambda x - Lx = 0 \tag{6.3.1} \]

has a nontrivial solution $x$ are $\lambda_1,\lambda_2,\dots,\lambda_m$. Moreover, if $\lambda = \lambda_i$, then the corresponding solution $x$ must lie in $\mathcal{R}(P_i)$ and vice versa, that is, $\mathcal{R}(P_i) = \mathcal{N}(\lambda_i I - L)$.


The $\lambda_i$'s are referred to as the eigenvalues of $L$ and $\mathcal{N}(\lambda_i I - L)$ is called the
eigenmanifold associated with $\lambda_i$. Solving Equation (6.3.1), or equivalently $Lx = \lambda x$,
is sometimes referred to as the eigenvalue-eigenvector problem for the operator $L$.
We shall return to these concepts in Section 5.


Proof: We use Theorem 6.2.4. Let $H = \mathcal{R}(P_1) + \cdots + \mathcal{R}(P_m)$ be the decomposition
of $H$. For each $x$ in $H$ we let $x = x_1 + \cdots + x_m$, where $x_i \in \mathcal{R}(P_i)$. Then

$$Lx = \lambda_1 x_1 + \cdots + \lambda_m x_m .$$

If $Lx = \lambda x$, then, by using the fact that $(\lambda_j x_j, x_i) = 0$ if $i \ne j$, we have

$$\lambda_i (x_i, x_i) = \lambda (x_i, x_i);$$

indeed,

$$\lambda_i (x_i, x_i) = (\lambda_1 x_1 + \cdots + \lambda_m x_m, x_i) = (Lx, x_i) = (\lambda x, x_i) = \lambda (x_1 + \cdots + x_m, x_i) = \lambda (x_i, x_i),$$

for $i = 1,\dots,m$. Since the $\{\lambda_1,\dots,\lambda_m\}$ are distinct, the above equality will hold only
if $\lambda = \lambda_i$ or $x_i = 0$.

Now if $\lambda \ne \lambda_i$ for every $i$, then it must be that $x_i = 0$ for all $i$, so then it also must be that
$x = x_1 + \cdots + x_m = 0$. Hence the only solution of (6.3.1) in this case is the trivial
solution.

On the other hand, if $\lambda = \lambda_i$, then any nonzero vector $x$ in $\mathcal{R}(P_i)$ is a solution
of (6.3.1).

Let us now show that if $\lambda = \lambda_i$, then the corresponding solution $x$ of (6.3.1)
must lie in $\mathcal{R}(P_i)$. [This will prove that $\mathcal{R}(P_i) = \mathcal{N}(\lambda_i I - L)$.] Since the $\lambda_j$'s are
distinct and

$$0 = (\lambda_i I - L)(x_1 + \cdots + x_m) = (\lambda_i - \lambda_1) x_1 + \cdots + (\lambda_i - \lambda_m) x_m ,$$

we see that $x_j = 0$ for $j \ne i$ and $x = x_i$, that is, $x \in \mathcal{R}(P_i)$. ∎
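The statement of Theorem 6.3.1 can be checked numerically on a small case. The following is a minimal sketch, assuming $H = R^2$, the orthonormal vectors $e_1 = (1,1)/\sqrt{2}$ and $e_2 = (1,-1)/\sqrt{2}$, and the scalars $\lambda_1 = 3$, $\lambda_2 = 1$; these choices and all function names are illustrative and do not come from the text.

```python
import math

def outer(e):
    """Matrix of the orthogonal projection onto span{e}, e a real unit vector."""
    return [[ei * ej for ej in e] for ei in e]

def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

def matvec(A, x):
    return [sum(a * xi for a, xi in zip(row, x)) for row in A]

# Assumed example data: a resolution of the identity {P1, P2} on R^2.
s = 1 / math.sqrt(2)
e1, e2 = [s, s], [s, -s]
P1, P2 = outer(e1), outer(e2)
lam1, lam2 = 3.0, 1.0

# L = lam1*P1 + lam2*P2, which is (up to rounding) [[2, 1], [1, 2]].
L = [[lam1 * P1[i][j] + lam2 * P2[i][j] for j in range(2)] for i in range(2)]

def has_nontrivial_solution(lam, tol=1e-12):
    """In two dimensions, lam*x - L*x = 0 has a nonzero solution
    exactly when det(lam*I - L) vanishes."""
    a, b = lam - L[0][0], -L[0][1]
    c, d = -L[1][0], lam - L[1][1]
    return abs(a * d - b * c) < tol

# Only lam1 and lam2 admit nontrivial solutions; lam = 2 does not.
print([has_nontrivial_solution(lam) for lam in (3.0, 1.0, 2.0)])
# Vectors in the range of P1 are eigenvectors for lam1.
print(matvec(L, e1), [lam1 * c for c in e1])
```

The determinant test is specific to this finite-dimensional sketch; the theorem itself makes no such assumption.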


In the next section we shall consider a finite-dimensional problem. This will
illustrate the techniques we wish to develop later. But before we do this, let us note
that the geometric analysis described in Theorem 6.2.4 leads naturally to the
eigenvalue-eigenvector problem, Equation (6.3.1). In the next section we shall show how
the solution of the eigenvalue-eigenvector problem (for finite-dimensional
self-adjoint operators) leads back to the geometric analysis.

Finally, the conclusion of Theorem 6.2.4 can be reformulated another way in
terms of eigenvalues and eigenvectors. In practice, this version seems to be more
useful, so we ask the reader to take particular note of it.




6.3.2 THEOREM. Let $\{P_1,\dots,P_m\}$ be a resolution of the identity on a complex
Hilbert space $H$ and let $L = \sum_{i=1}^{m} \lambda_i P_i$, where $\{\lambda_1,\dots,\lambda_m\}$ are distinct scalars. Then
there exists an orthonormal basis $\{e_n\}$ of eigenvectors of $L$, that is, $Le_n = \mu_n e_n$, where
$\mu_n$ is one of the numbers $\{\lambda_1,\dots,\lambda_m\}$. Moreover, every vector $x \in H$ can be written in
the form $x = \sum_n (x,e_n) e_n$ and $Lx = \sum_n \mu_n (x,e_n) e_n$.


Proof: Let $H = \mathcal{R}(P_1) + \cdots + \mathcal{R}(P_m)$ be the decomposition of $H$ generated by
the resolution of the identity. Let $A_i$ be an orthonormal basis for $\mathcal{R}(P_i)$, $1 \le i \le m$. It
follows from Theorem 6.3.1 that each vector in $A_i$ is an eigenvector for $L$. Thus, let
$A = A_1 \cup \cdots \cup A_m$. We claim $A$ is a basis for $H$. However, this follows from
Lemma 6.2.2 and the Orthogonal Structure Theorem. The rest of the theorem now
follows from the Fourier Series Theorem (Theorem 5.17.8) and the fact that $L$ is
linear and continuous. ∎
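The expansion $Lx = \sum_n \mu_n (x,e_n) e_n$ can be tried directly on a small case. This is a sketch under stated assumptions: $H = C^2$ with the inner product $(x,y) = \sum_k x_k \overline{y_k}$, the orthonormal eigenvectors $(1,1)/\sqrt{2}$ and $(1,-1)/\sqrt{2}$, and eigenvalues 3 and 1. The names `inner`, `expand` and `apply_L` are hypothetical helpers.

```python
import math

def inner(x, y):
    """(x, y) = sum_k x_k * conj(y_k), linear in the first argument."""
    return sum(a * b.conjugate() for a, b in zip(x, y))

# Assumed orthonormal basis of eigenvectors and their eigenvalues.
s = 1 / math.sqrt(2)
basis = [[complex(s), complex(s)], [complex(s), complex(-s)]]
mus = [3.0, 1.0]

def expand(x, weights):
    """Return sum_n weights[n] * (x, e_n) * e_n."""
    out = [0j] * len(x)
    for w, e in zip(weights, basis):
        c = inner(x, e)                       # Fourier coefficient (x, e_n)
        out = [o + w * c * ek for o, ek in zip(out, e)]
    return out

def apply_L(x):
    return expand(x, mus)

x = [1 + 2j, -1j]
print(expand(x, [1.0, 1.0]))   # Fourier series of x itself: recovers x up to rounding
print(apply_L(x))              # agrees with the matrix [[2, 1], [1, 2]] applied to x
```

Here $L$ is never stored as a matrix; it acts only through the coefficients $(x,e_n)$, which is the form of the conclusion of Theorem 6.3.2.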


EXERCISES 
1. What happens to Theorem 6.3.1 if the scalars $\{\lambda_1,\dots,\lambda_m\}$ are not distinct? What
happens if $P_i = 0$ for some $i$?


2. Let $\{e_n\}$ be an orthonormal set in a Hilbert space $H$ and let $\{\mu_1,\dots,\mu_m\}$ be a
collection of scalars. Define $L: H \to H$ by

$$Lx = \sum_{n=1}^{m} \mu_n (x,e_n) e_n .$$

(a) Show that $L$ is a bounded linear operator.
(b) Show that $L$ is normal.
(c) Show that

$$\|L\| = \max\{|\mu_n| : n = 1,\dots,m\}.$$


3. Let $\{e_n\}$ be an orthonormal set in a Hilbert space $H$ and let $\{\mu_n\}$ be a collection
of scalars from a bounded set $D$. Define $L: H \to H$ by

$$Lx = \sum_{n} \mu_n (x,e_n) e_n .$$

(a) Show that $L$ is a bounded linear operator.
(b) What is $L^*$?
(c) Show that $L$ is normal.
(d) Show that

$$\|L\| = \sup_n |\mu_n| .$$


4. Let $L$ be a self-adjoint operator on a Hilbert space $H$. Show that if $L$ is an
orthogonal projection, then the only eigenvalues of $L$ are 0 and/or 1.


4. A FINITE-DIMENSIONAL PROBLEM


Let $L: H \to H$ be a bounded linear operator on a complex Hilbert space $H$.
Assume that $L$ is self-adjoint and that the range of $L$ is finite dimensional. We will
now show that there is a resolution of the identity $\{P_0,\dots,P_m\}$ on $H$ and a family
of distinct scalars $\{\lambda_0,\dots,\lambda_m\}$ such that $L = \sum_{i=0}^{m} \lambda_i P_i$. By then using Theorem
6.2.4 or Theorem 6.3.2 one arrives at a geometric analysis of $L$.

First we note that the range $\mathcal{R}(L)$ is invariant under $L$. Since $L$ is self-adjoint,
one has $\mathcal{N}(L) = \mathcal{R}(L)^{\perp}$. Indeed if $x \in \mathcal{N}(L)$ and $y \in H$, then we see that $x \perp Ly$
since $(x,Ly) = (Lx,y) = (0,y) = 0$. Let $P_0$ be the orthogonal projection onto $\mathcal{N}(L)$
and let $\lambda_0 = 0$. We note that $L$ is a one-to-one mapping of $\mathcal{R}(L)$ onto $\mathcal{R}(L)$. Let
$\{\lambda_1,\dots,\lambda_m\}$ denote the nonzero eigenvalues of $L$ and assume that the $\lambda_i$'s are distinct.
Let $R_i$ denote the set

$$R_i = \{x \in \mathcal{R}(L) : Lx = \lambda_i x\}, \qquad 1 \le i \le m .$$

Each $R_i$ is then a finite-dimensional linear subspace of $\mathcal{R}(L)$ and, by Theorem
5.10.3, it is a closed linear subspace of $\mathcal{R}(L)$ and hence of $H$. We let $P_i$, $1 \le i \le m$,
be the orthogonal projection onto $R_i$. We claim that $\{P_0,P_1,\dots,P_m\}$ is a resolution
of the identity on $H$ and that $L = \sum_{i=0}^{m} \lambda_i P_i$.

First we observe that each $P_i$ is an orthogonal projection, by construction.
Next we claim that $P_i P_j = 0$ if $i \ne j$. In order to prove this we need the following
result.


6.4.1 LEMMA. Let $L: H \to H$ be a self-adjoint operator. Then the eigenvalues of
$L$ are real.


Proof: Let $\lambda$ be an eigenvalue of $L$ and let $x$ be a nonzero vector that satisfies
$Lx = \lambda x$. One then has

$$0 = (Lx,x) - (x,Lx) = (\lambda x,x) - (x,\lambda x) = (\lambda - \bar{\lambda})(x,x),$$

which implies that $\lambda = \bar{\lambda}$. ∎


We now note that $P_i P_j = 0$ for $i \ne j$ is equivalent to saying that $R_i \perp R_j$ for $i \ne j$.
To show this, we let $\lambda_i$ and $\lambda_j$ be distinct eigenvalues and choose nonzero eigenvectors
$x$ and $y$ in $R_i$ and $R_j$, respectively. Since $\lambda_j$ is real by Lemma 6.4.1, one then has

$$0 = (Lx,y) - (x,Ly) = (\lambda_i x,y) - (x,\lambda_j y) = (\lambda_i - \lambda_j)(x,y),$$

which implies that $(x,y) = 0$.

Let $Q = P_0 + \cdots + P_m$. It is easy to see that $Q$ is an orthogonal projection
(compare with Exercise 3, Section 5.16). To show that $\{P_0,\dots,P_m\}$ is a resolution
of the identity, we want to show that $Q = I$, or equivalently that $M = H$, where
$M = \mathcal{R}(Q)$. Since $L(M) \subset M$, it follows (Theorem 5.22.4) that $L: M^{\perp} \to M^{\perp}$. Since
$M^{\perp} \subset \mathcal{R}(L)$, it follows that $L$ is a one-to-one mapping of $M^{\perp}$ onto itself. We want
to show that $M^{\perp}$ is trivial, that is, $M^{\perp} = \{0\}$. This is, however, an immediate
consequence of the following result.


6.4.2 THEOREM. Let $Y$ be a finite-dimensional complex Hilbert space, with
$\dim Y = n \ge 1$. Let $L: Y \to Y$ be a linear mapping of $Y$ into $Y$. Then $L$ has at least
one eigenvalue.


(In applying this theorem to our problem, we merely observe that the mapping
$L: M^{\perp} \to M^{\perp}$ has no eigenvalues. It follows then that $\dim M^{\perp} = 0$.)


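Proof: Let $\{e_1,\dots,e_n\}$ be a basis for $Y$. Any vector $x$ in $Y$ can be written as
$x = \sum_{i=1}^{n} x_i e_i$. If we identify $x$ with the n-tuple of complex numbers $(x_1,\dots,x_n)$
and $y$ with $(y_1,\dots,y_n)$, one can then express the relationship $y = Lx$ by

$$y_i = \sum_{j=1}^{n} l_{ij} x_j , \tag{6.4.1}$$

where $Le_j = \sum_{i=1}^{n} l_{ij} e_i$. In matrix notation, (6.4.1) can be written as

$$[y] = [L][x],$$

where $[y]$ and $[x]$ are column vectors with entries $(y_1,\dots,y_n)$ and $(x_1,\dots,x_n)$
respectively and $[L]$ is the $n \times n$ matrix with entries $(l_{ij})$. The eigenvalue problem
$Lx - \lambda x = 0$ then becomes

$$\begin{pmatrix} l_{11} - \lambda & l_{12} & \cdots & l_{1n} \\ l_{21} & l_{22} - \lambda & \cdots & l_{2n} \\ \vdots & \vdots & \ddots & \vdots \\ l_{n1} & l_{n2} & \cdots & l_{nn} - \lambda \end{pmatrix} \begin{pmatrix} x_1 \\ x_2 \\ \vdots \\ x_n \end{pmatrix} = \begin{pmatrix} 0 \\ 0 \\ \vdots \\ 0 \end{pmatrix} \tag{6.4.2}$$

or

$$\sum_{j=1}^{n} (l_{ij} - \lambda \delta_{ij}) x_j = 0, \qquad i = 1,\dots,n .$$

By Cramer's Rule, Equation (6.4.2) has a nontrivial solution $(x_1,\dots,x_n)$ if and
only if the determinant

$$p(\lambda) = \det(l_{ij} - \lambda \delta_{ij})$$

vanishes at $\lambda$. Since $p(\lambda)$ is a polynomial of degree $n \ge 1$ it has at least one zero $\lambda$.
This zero $\lambda$ is then an eigenvalue of $L$. ∎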


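We have thus shown that $\{P_0,\dots,P_m\}$ forms a resolution of the identity. This
means that $H = R_0 + R_1 + \cdots + R_m = R_0 \oplus R_1 \oplus \cdots \oplus R_m$, where $R_0 = \mathcal{N}(L)$.
That is, any vector $x \in H$ can be written uniquely as $x = x_0 + \cdots + x_m$, where
$x_i \in R_i$.

Since

$$Lx = L(x_0 + \cdots + x_m) = \lambda_0 x_0 + \cdots + \lambda_m x_m = \lambda_0 P_0 x + \cdots + \lambda_m P_m x = (\lambda_0 P_0 + \cdots + \lambda_m P_m)x,$$

it follows that $L = \sum_{i=0}^{m} \lambda_i P_i$.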


Let us summarize what we have just proven. 


6.4.3 THEOREM. Let $L: H \to H$ be a bounded, linear self-adjoint operator on a
Hilbert space $H$. Assume that the range of $L$ is finite-dimensional. Then there is a
resolution of the identity $\{P_0,P_1,\dots,P_m\}$ and corresponding real numbers
$\{\lambda_0,\lambda_1,\dots,\lambda_m\}$ such that $L = \sum_{i=0}^{m} \lambda_i P_i$.




We see then that Theorem 6.2.4 or Theorem 6.3.2 can be applied to the operator 
L given above. These two theorems (Theorems 6.2.4 and 6.3.2) are prototypes of the 
Spectral Theorem, which is discussed in Section 11. 
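As an illustration of Theorem 6.4.3 and Lemma 6.4.1, the following sketch builds the resolution of the identity for one self-adjoint operator on $C^2$. The matrix `A` is an assumed example, and the closed-form eigenvector $(b, \lambda - a)$ is a shortcut that works only for a $2 \times 2$ matrix with nonzero off-diagonal entry $b$ and simple eigenvalues; it is not the construction used in the text. Since this `A` is invertible, $\mathcal{N}(A) = \{0\}$ and the $P_0$ term is zero, so it is omitted.

```python
import cmath

# Assumed example: a self-adjoint (Hermitian) 2x2 matrix.
A = [[2 + 0j, 1 - 1j],
     [1 + 1j, 3 + 0j]]

def eigenvalues_2x2(M):
    """Zeros of det(M - lam*I) = lam^2 - tr(M)*lam + det(M)."""
    tr = M[0][0] + M[1][1]
    det = M[0][0] * M[1][1] - M[0][1] * M[1][0]
    disc = cmath.sqrt(tr * tr - 4 * det)
    return [(tr - disc) / 2, (tr + disc) / 2]

def projection_onto_eigenspace(M, lam):
    """Orthogonal projection onto span{v}, where v = (b, lam - a) solves
    (M - lam*I)v = 0. Assumes b = M[0][1] != 0 and a simple eigenvalue."""
    v = [M[0][1], lam - M[0][0]]
    norm2 = sum(abs(c) ** 2 for c in v)
    return [[v[i] * v[j].conjugate() / norm2 for j in range(2)]
            for i in range(2)]

lams = eigenvalues_2x2(A)                       # real, as Lemma 6.4.1 predicts
projs = [projection_onto_eigenspace(A, lam) for lam in lams]

# P1 + P2 should be the identity, and lam1*P1 + lam2*P2 should equal A.
identity = [[sum(P[i][j] for P in projs) for j in range(2)] for i in range(2)]
recon = [[sum(l * P[i][j] for l, P in zip(lams, projs)) for j in range(2)]
         for i in range(2)]
print(lams)
print(identity)
print(recon)
```

For this matrix the characteristic polynomial is $\lambda^2 - 5\lambda + 4$, so the eigenvalues are 1 and 4.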

Before we turn to the study of general operators let us note here that there is 
a third way of formulating the Spectral Theorem. We shall do this for linear opera- 
tors on a finite-dimensional space. 


6.4.4 THEOREM. Let $H$ be a finite-dimensional complex Hilbert space and let
$L: H \to H$ be a self-adjoint linear operator. Then there is a basis in $H$ such that the
matrix that represents $L$ in this basis is a diagonal matrix.


Proof: By the analysis above, we see that the operator $L$ satisfies the
hypotheses of Theorem 6.3.2. Let $\{e_1,\dots,e_n\}$ be an orthonormal basis of eigenvectors
of $L$ so that $Le_i = \mu_i e_i$, for some eigenvalue $\mu_i$. It follows then that $L$ is
represented by the diagonal matrix

$$\begin{pmatrix} \mu_1 & & 0 \\ & \ddots & \\ 0 & & \mu_n \end{pmatrix}$$

in this basis. ∎


There is yet another way of interpreting the last result. This may be called the
"transfer function representation" for $L$. It follows from the Parseval Equality (see
the Fourier Series Theorem) that the mapping

$$U: x \to (x_1,\dots,x_n),$$

where $x_i = (x,e_i)$, is a unitary mapping of $H$ onto $C^n$. This means that one can
write

$$L = U^{-1} \Lambda U, \qquad \text{or} \qquad \Lambda = U L U^{-1},$$

where $\Lambda$ is the diagonal matrix $\Lambda = \mathrm{diag}(\mu_1,\dots,\mu_n)$. The representation $\Lambda$ is
sometimes called the "transfer function" of $L$.
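The relation $\Lambda = U L U^{-1}$ can be verified on a small case. This is a sketch under assumptions: $H = C^2$, $L$ is the matrix with rows $(2,1)$ and $(1,2)$, and its orthonormal eigenvectors are $(1,1)/\sqrt{2}$ and $(1,-1)/\sqrt{2}$. Row $i$ of the matrix `U` holds the conjugated entries of $e_i$, so that $(Ux)_i = (x,e_i)$.

```python
import math

def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

def adjoint(A):
    """Conjugate transpose of a matrix."""
    return [[A[j][i].conjugate() for j in range(len(A))]
            for i in range(len(A[0]))]

s = 1 / math.sqrt(2)
L = [[2 + 0j, 1 + 0j], [1 + 0j, 2 + 0j]]           # assumed example operator
e = [[s + 0j, s + 0j], [s + 0j, -s + 0j]]          # its orthonormal eigenvectors

U = [[c.conjugate() for c in ei] for ei in e]      # (Ux)_i = (x, e_i)
U_inv = adjoint(U)                                 # U is unitary, so U^{-1} = U^*
Lam = matmul(matmul(U, L), U_inv)                  # Lambda = U L U^{-1}

print(Lam)   # diagonal up to rounding, with the eigenvalues 3 and 1 on the diagonal
```

In this representation applying $L$ amounts to multiplying the $i$-th coordinate by $\mu_i$, which is the sense in which $\Lambda$ acts like a transfer function.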


EXAMPLE 1. Let $X$ denote the collection of all random variables $\xi$ defined on
a probability space $(\Omega,\mathcal{A},P)$ with range in a finite-dimensional complex Hilbert
space $H$ and satisfying the following conditions:


(i) $E[\xi] = 0$,
(ii) $E[\|\xi\|^2] < \infty$,
(iii) $E[|(x,\xi)|^2] > 0$ for all $x \in H$ with $x \ne 0$.


Here we use $E$ to denote the mathematical expectation. Furthermore we note that
if $x \in H$ and $\xi \in X$, then

$$E[(x,\xi)] = (x,E[\xi]) = (x,0) = 0. \tag{6.4.4}$$




Now let $\xi^1$, $\xi^2$, and $\xi^3$ be three random variables in $X$ with the same distribution
functions, that is,

$$P[\xi^1(\omega) \in A] = P[\xi^2(\omega) \in A] = P[\xi^3(\omega) \in A], \tag{6.4.5}$$

for all measurable sets $A \subset H$. Assume further that

$$\xi^1 \text{ and } \xi^2 \text{ are stochastically independent} \tag{6.4.6}$$

and that there is a self-adjoint operator $L$ on $H$ that satisfies

$$(Lx,x) > 0 \text{ for all } x \in H \text{ with } x \ne 0, \tag{6.4.7}$$

and such that $\xi^1 + \xi^2$ and $\sqrt{2}\,L\xi^3$ have the same distribution functions, that is,

$$P[(\xi^1(\omega) + \xi^2(\omega)) \in A] = P[\sqrt{2}\,L\xi^3(\omega) \in A] \tag{6.4.8}$$

for all measurable sets $A \subset H$. We claim that² $L = I$, the identity operator.

Let us now prove that $L = I$. For this purpose we shall use Theorem 6.4.3 which
assures us that $L = \sum_{i=0}^{m} \lambda_i P_i$, where $\{P_0,P_1,\dots,P_m\}$ is a resolution of the identity
on $H$. Let $x \in \mathcal{R}(P_i)$ with $x \ne 0$. Now by (6.4.7) we see that

$$0 < (Lx,x) = (\lambda_i x,x) = \lambda_i (x,x).$$

Hence $\lambda_i > 0$ for $i = 0, 1,\dots,m$.
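Since $\xi^1$, $\xi^2$, and $\xi^3$ have the same distribution functions we see that

$$E[|(x,\xi^1)|^2] = E[|(x,\xi^2)|^2] = E[|(x,\xi^3)|^2].$$

Furthermore, since $\xi^1$ and $\xi^2$ are independent, the real-valued random variables
$\mathrm{Re}(x,\xi^1)$ and $\mathrm{Re}(x,\xi^2)$ are independent for each $x \in H$. Also $\mathrm{Im}(x,\xi^1)$ and $\mathrm{Im}(x,\xi^2)$
are independent. Hence

$$E[(x,\xi^1)\overline{(x,\xi^2)} + \overline{(x,\xi^1)}(x,\xi^2)] = 0, \tag{6.4.9}$$

since

$$\begin{aligned}
E[(x,\xi^1)\overline{(x,\xi^2)} + \overline{(x,\xi^1)}(x,\xi^2)]
&= 2E[\mathrm{Re}(x,\xi^1) \cdot \mathrm{Re}(x,\xi^2) + \mathrm{Im}(x,\xi^1)\,\mathrm{Im}(x,\xi^2)] \\
&= 2\{E[\mathrm{Re}(x,\xi^1)]E[\mathrm{Re}(x,\xi^2)] + E[\mathrm{Im}(x,\xi^1)]E[\mathrm{Im}(x,\xi^2)]\} \\
&= 2\{\mathrm{Re}\,E[(x,\xi^1)]\,\mathrm{Re}\,E[(x,\xi^2)] + \mathrm{Im}\,E[(x,\xi^1)]\,\mathrm{Im}\,E[(x,\xi^2)]\} \\
&= 0
\end{aligned}$$

by (6.4.4).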


Let $\xi$ denote either $\xi^1$, $\xi^2$, or $\xi^3$. The relationship

$$E[(x,\xi)\overline{(y,\xi)}] = q(x,y)$$

defines a sesquilinear functional on $H$ (see Exercise 8, Section 5.12). By using (ii)
and the Schwarz Inequality it is easy to see that $q$ is bounded. It follows from
Exercise 8, Section 5.23, that there is a bounded linear operator $S$ on $H$ with

$$q(x,y) = (Sx,y),$$

which then gives

$$E[|(x,\xi)|^2] = (Sx,x). \tag{6.4.10}$$

It follows from (iii) that

$$(Sx,x) > 0 \text{ for all } x \in H,\ x \ne 0. \tag{6.4.11}$$

By applying the self-adjointness of $L$ together with (6.4.8) and (6.4.9) we get

$$\begin{aligned}
E[|(\sqrt{2}\,Lx,\xi^3)|^2] &= E[|(x,\sqrt{2}\,L\xi^3)|^2] \\
&= E[|(x,\xi^1 + \xi^2)|^2] = E[|(x,\xi^1) + (x,\xi^2)|^2] \\
&= E[|(x,\xi^1)|^2] + E[|(x,\xi^2)|^2].
\end{aligned}$$

By using (6.4.10), the last equation can be written as

$$(S\sqrt{2}\,Lx,\sqrt{2}\,Lx) = (Sx,x) + (Sx,x)$$

or

$$(Sx,x) = (SLx,Lx).$$

If we replace $x$ by $Lx$, above, we get

$$(Sx,x) = (SL^2x,L^2x).$$

By repeating this step we get

$$(Sx,x) = (SL^nx,L^nx), \qquad n = 1,2,\dots. \tag{6.4.12}$$

² There is nothing mysterious about the factor $\sqrt{2}$ used above. If this did not appear in Equation
(6.4.8), then one would show instead that $L = \sqrt{2}\,I$.


Let us now show that the eigenvalues $\lambda_0, \lambda_1,\dots,\lambda_m$ of $L$ must satisfy $1 \le \lambda_i$ for
all $i$. Indeed, if this were not true, say that $\lambda_i < 1$, then by choosing $x \in \mathcal{R}(P_i)$ and
$x \ne 0$ we get $Lx = \lambda_i x$, $L^2x = \lambda_i^2 x$, and $L^nx = \lambda_i^n x$. Hence

$$|(SL^nx,L^nx)| \le \|S\|\,\|L^nx\|^2 = \|S\|\,\lambda_i^{2n}\,\|x\|^2 \to 0$$

as $n \to \infty$. Thus (6.4.12) implies that $(Sx,x) = 0$, which contradicts (6.4.11).

The final step is to show that the eigenvalues $\lambda_i$ satisfy $\lambda_i \le 1$. We do this by
studying the inverse $L^{-1}$.

Since the eigenvalues $\lambda_i$ are positive, the inverse $L^{-1}$ exists and must be
continuous. (Here we use Theorem 5.10.4 and the fact that $L^{-1}$ exists if and only if
$\lambda = 0$ is not an eigenvalue.) Furthermore, $L^{-1}$ must be self-adjoint, as can easily be
checked. Finally we note that if one applies Theorem 6.4.3 to $L^{-1}$, one gets

$$L^{-1} = \sum_{i=0}^{m} \lambda_i^{-1} P_i ,$$

where the $\lambda_i$'s and $P_i$'s are the same quantities used for the operator $L$.

One can replace $x$ with $L^{-n}x$ in (6.4.12) and thereby get

$$(Sx,x) = (SL^{-n}x,L^{-n}x).$$

If we repeat the argument following Equation (6.4.12) we conclude that $1 \le \lambda_i^{-1}$
for all $i$, that is, $\lambda_i \le 1$ for all $i$. Hence $\lambda_i = 1$ for all $i$ and $L = I$.
The results of this example can be extended, see Prokhorov and Fish [1]. ∎


EXERCISES 


1. Let $H$ be a complex Hilbert space with finite dimension $n$ and let $L: H \to H$ be a
self-adjoint operator. Let $\{e_1,\dots,e_n\}$ be any orthonormal basis in $H$ and let
$(l_{ij})$ denote the matrix representation of $L$ with respect to this basis. Let
$p(\lambda) = \det(l_{ij} - \lambda \delta_{ij})$. Show that the eigenvalues of $L$ are precisely the zeros of $p(\lambda)$.
Factor $p(\lambda)$ as

$$p(\lambda) = (-1)^n (\lambda - \lambda_1)^{m_1} \cdots (\lambda - \lambda_k)^{m_k},$$

where the roots $\{\lambda_1,\dots,\lambda_k\}$ are distinct, $m_i \ge 1$, and $m_1 + \cdots + m_k = n$. For
$1 \le i \le k$, let $R_i = \{x \in H : Lx = \lambda_i x\}$.

(a) Show that $\dim R_i = m_i$.

(b) Show that $L = \lambda_1 P_1 + \cdots + \lambda_k P_k$, where $P_i$ is the orthogonal projection
onto $R_i$.


2. Let T be the matrix operator 


on C°. Find the eigenvalues and eigenvectors for T. Determine the corresponding 
resolution of the identity for T and express the corresponding projections as 
matrix operators. 


3. Let $\phi_1,\dots,\phi_n, \psi_1,\dots,\psi_n \in L_2(I)$ and let $k(s,t) = \sum_{i=1}^{n} \phi_i(s)\psi_i(t)$. Assume that
$k(s,t) = k(t,s)$.
(a) Show that the integral operator $y = Kx$, where

$$y(s) = \int_I k(s,t)\,x(t)\,dt,$$

is self-adjoint and has a finite-dimensional range.

(b) Determine the eigenvalues and eigenvectors of $K$.

(c) Assume that the eigenvalues are nonnegative. Do there exist functions
$\{\xi_1,\dots,\xi_n\}$ in $L_2(I)$ such that $k(s,t) = \sum_{i=1}^{n} \xi_i(s)\xi_i(t)$? What happens if the
eigenvalues are negative?


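4. Let $H = l_2$ be the space of sequences $x = (x_1,x_2,\dots)$ of complex numbers with
$\sum_{n=1}^{\infty} |x_n|^2 < \infty$ and with the usual inner product. Let $K = (k_{ij})$ be any $n \times n$ matrix
and define the infinite matrix $L$ by

$$L = \begin{pmatrix} K & 0 \\ 0 & 0 \end{pmatrix} = \begin{pmatrix} k_{11} & \cdots & k_{1n} & 0 & \cdots \\ \vdots & & \vdots & \vdots & \\ k_{n1} & \cdots & k_{nn} & 0 & \cdots \\ 0 & \cdots & 0 & 0 & \cdots \\ \vdots & & \vdots & \vdots & \ddots \end{pmatrix}.$$

What is $L^*$? Show that $L = L^*$ if and only if $K = K^*$. Determine the eigenvalues
and eigenvectors for $L$.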


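5. Let $L$ be the matrix operator

$$L = \begin{pmatrix} 4+2i & -2+2i & -2+2i \\ -2+2i & 4+5i & -2-i \\ -2+2i & -2-i & 4+5i \end{pmatrix}$$

on $C^3$. Show that $L$ is normal. Show that there are eigenvectors $\{e_1,e_2,e_3\}$
and corresponding eigenvalues $\{\lambda_1,\lambda_2,\lambda_3\}$ for $L$ so that $\{e_1,e_2,e_3\}$ forms an
orthonormal basis. [Answer: $\lambda_1 = 6+6i$, $e_1 = (0, 1/\sqrt{2}, -1/\sqrt{2})$; $\lambda_2 = 6i$,
$e_2 = (1/\sqrt{3}, 1/\sqrt{3}, 1/\sqrt{3})$; $\lambda_3 = 6$, $e_3 = (2/\sqrt{6}, -1/\sqrt{6}, -1/\sqrt{6})$.]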


6. What happens in Example | if one drops assumption (6.4.7)? 


7. Let $L$ and $M$ be two self-adjoint matrix operators on a finite-dimensional
Hilbert space $H$ and assume that $LM = ML$. Show that there is a single unitary
mapping $U: H \to H$ such that

$$ULU^{-1} \quad \text{and} \quad UMU^{-1}$$

are diagonal matrices. (That is, if $L$ and $M$ commute they have a common
diagonalization.) What happens if $L$ and $M$ do not commute?


8. Let $\{L_1,\dots,L_k\}$ be a family of self-adjoint matrix operators on a finite-dimensional
Hilbert space $H$ and assume that $L_iL_j = L_jL_i$ for $i, j = 1,2,\dots,k$.
Show that there is a single unitary mapping $U: H \to H$ such that

$$UL_1U^{-1},\ UL_2U^{-1},\ \dots,\ UL_kU^{-1}$$

are diagonal matrices.


9. Find the eigenvalues and a corresponding orthonormal set of eigenvectors for
the following matrix operators acting on $C^n$.


off ears 
o BS 2 efi 
2 3 0 


3 The matrix in (d) is normal but not self-adjoint. However, it still has an orthonormal basis of 
eigenvectors. Does this suggest a theorem? 


10. 




Boe g9 2 2 0 
(e) -. ; (fy L] 2 86 —40 0 
45,20 —40 95 0 
0 oOo 04 
(Eigenvalues are A = 1,2,3,4.) 
24/6 0 0 0 0 
1 | 0 6/6 72 -18 —!18 
© a6 0 72 8/6 2/6 2/6 
0 -18 2/6 17/6 —7,/6 
0 -18 2/6 -7/6 17,/6 
(Eigenvalues are 2 = 1,—1,2.) 


In each of the following exercises you are asked to show that the operator $K$,
given by

$$y(t) = \int_{-\pi}^{\pi} k(t,\tau)\,x(\tau)\,d\tau,$$

is a self-adjoint operator on $L_2[-\pi,\pi]$ with finite-dimensional range. You are
also asked to determine all the nonzero eigenvalues $\{\lambda_1,\lambda_2,\dots,\lambda_m\}$ and corresponding
projections $\{P_1,\dots,P_m\}$ so that $K = \lambda_1P_1 + \cdots + \lambda_mP_m$.

(a) $k(t,\tau) = 4\cos(t - \tau)$.

(b) $k(t,\tau) = 1 + \cos(t - \tau)$.

(c) $k(t,\tau) = 1 + \cos(t - \tau) + \sin 2(t + \tau)$.

(d) $k(t,\tau) = \sin 3(t - \tau)$.

(e) k(t,t) = )*_, {a, cos n(t — t) + b, sin n(t — 1)}. 


11. Let $P$ be an orthogonal projection on a finite-dimensional Hilbert space $H$.

(a) Use the theorems of this section to give a spectral analysis of $P$.
(b) Express $P$ as a diagonal matrix operator.


Part B 
The Spectrum 


5. THE SPECTRUM OF A LINEAR TRANSFORMATION 


In the last three sections we have considered several aspects of the eigenvalue- 
eigenvector problem for linear operators. It will be helpful to reformulate some of 
these concepts in a slightly more general context. 


6.5.1 DEFINITION. Let $T$ be a linear transformation with its domain $\mathcal{D}(T)$ and
range $\mathcal{R}(T)$ contained in a linear space $X$. A scalar $\lambda$ such that there exists an
$x \in \mathcal{D}(T)$, $x \ne 0$, satisfying the equation $Tx = \lambda x$, is said to be an eigenvalue of $T$.
If $\lambda$ is an eigenvalue of $T$, any nonzero $x \in \mathcal{D}(T)$ satisfying the equation $Tx = \lambda x$ is
said to be an eigenvector of $T$ corresponding to the eigenvalue $\lambda$. If $\lambda$ is an eigenvalue
of $T$, the null space of the transformation $\lambda I - T$, $\mathcal{N}(\lambda I - T)$, is said to be
the eigenmanifold (eigenspace) corresponding to the eigenvalue $\lambda$. The dimension of
the eigenmanifold is called the multiplicity of the eigenvalue $\lambda$.


Note that this definition applies to all linear operators, continuous or not. It is
important though to emphasize that the solution $x$ of $Tx = \lambda x$ must be nonzero
and must lie in $\mathcal{D}(T)$.

The reader may suspect that a knowledge of the eigenvalues and eigenvectors 
of a linear operator may be enough for a geometric analysis. Unfortunately this is 
not the case. 


EXAMPLE 1. Let $H = L_2(-\infty,\infty)$ and define $T: H \to H$ by $y = Tx$, where

$$y(t) = \int_{-\infty}^{t} e^{-(t-\tau)}\,x(\tau)\,d\tau. \tag{6.5.1}$$

One can show that $\mathcal{D}(T) = H$ and $\mathcal{R}(T) \subset H$. Since

$$\int_{-\infty}^{t} e^{-(t-s)} e^{i\omega s}\,ds = \frac{1}{1 + i\omega}\,e^{i\omega t},$$

one might be tempted to say that $x(t) = e^{i\omega t}$ is an eigenvector of $T$ with corresponding
eigenvalue $\lambda = (1 + i\omega)^{-1}$. However, $e^{i\omega t}$ is not in $L_2(-\infty,\infty)$, so this is not
correct. In fact, $T$ does not have any eigenvectors in $L_2(-\infty,\infty)$. Nevertheless, one
can give a geometric analysis of $T$ by using the Fourier transform. Such an analysis
requires knowledge of the spectrum of $T$. ∎


In order to motivate the definition of the spectrum, let us return to the concept
of an eigenvalue. First we observe that a scalar $\lambda$ is an eigenvalue for $T$ if and only if
the linear transformation $\lambda I - T$ is not one-to-one, that is, if and only if the null
space $\mathcal{N}(\lambda I - T)$ is nontrivial.

Of course, if $(\lambda I - T)$ is not one-to-one, it is not invertible. It does not even have
an inverse defined on its range. Now it can happen that $(\lambda I - T)$ is one-to-one, that
is, $\lambda$ is not an eigenvalue, and $(\lambda I - T)$ still does not have an inverse defined on $X$. Or,
bringing in topological considerations, the inverse may exist but not be continuous.
Indeed, there are a number of possibilities for $(\lambda I - T)$. It turns out that a key to the
analysis of a linear operator $T$ is the study of the subset of $\lambda$'s for which $\lambda I - T$ fails
to have a continuous inverse. This subset of $\lambda$'s is called the spectrum of $T$. Let us
now be more precise.


6.5.2 DEFINITION. Let $T$ be a linear transformation whose domain $\mathcal{D}(T)$ and
range $\mathcal{R}(T)$ are contained in a complex Banach space $X$. The set of all $\lambda$ such that the
range of the transformation $(\lambda I - T)$ is dense in $X$ and such that $(\lambda I - T)$ has a
continuous inverse defined on its range is said to be the resolvent set of $T$ and denoted
by $\rho(T)$. The set of all complex numbers that are not in the resolvent set is said to be
the spectrum of $T$ and denoted by $\sigma(T)$.


Needless to say, there are several ways that a complex number $\lambda$ can fail to be
in the resolvent set $\rho(T)$. This fact leads us to a subdivision of the spectrum.


(a) The point spectrum of a linear transformation $T$ is the subset of all $\lambda$'s,
denoted by $P\sigma(T)$, for which the transformation $(\lambda I - T)$ is not one-to-one. That is,
the point spectrum is exactly the set of all eigenvalues.

(b) The continuous spectrum of a linear transformation $T$ is the subset of all
$\lambda$'s, denoted by $C\sigma(T)$, for which the transformation $(\lambda I - T)$ has its range dense in
$X$, is one-to-one, and for which the inverse defined on the range is not continuous.

(c) The residual spectrum of a linear transformation $T$ is the subset of all $\lambda$'s,
denoted by $R\sigma(T)$, for which the transformation $(\lambda I - T)$ is one-to-one but does not
have its range dense in $X$.


The first things to note are that $P\sigma(T)$, $C\sigma(T)$, $R\sigma(T)$ are pairwise disjoint and that

$$\sigma(T) = P\sigma(T) \cup C\sigma(T) \cup R\sigma(T).$$


Figure 6.5.1 may aid the reader in remembering the contents of the above definition.

This may seem like a big jump from the set of all eigenvalues of an operator, but
as the story of spectral analysis unfolds the need for each aspect of this definition
reveals itself. Again, the definition has been stated in a more general form than
needed here. In particular, it has been stated for Banach spaces even though we are
only interested in Hilbert spaces here. Also, $\mathcal{D}(T)$ is not required to be all of $X$. This
latter situation arises, for example, when one considers, as we do in Chapter 7,
unbounded operators such as differential operators.
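The case analysis in Definition 6.5.2 and its subdivision (a) to (c) can be written as a short decision procedure. The sketch below assumes that the three properties of $\lambda I - T$ have already been established by analysis and are passed in as booleans; the function name `classify` and its argument names are hypothetical.

```python
def classify(one_to_one, dense_range, continuous_inverse):
    """Place a complex number lam according to three properties of lam*I - T.

    one_to_one:          lam*I - T is one-to-one
    dense_range:         the range of lam*I - T is dense in X
    continuous_inverse:  the inverse, defined on the range, is continuous
    Later arguments are ignored once an earlier test already decides the case.
    """
    if not one_to_one:
        return "point spectrum"
    if not dense_range:
        return "residual spectrum"
    if not continuous_inverse:
        return "continuous spectrum"
    return "resolvent set"

print(classify(False, True, True))   # point spectrum
print(classify(True, False, True))   # residual spectrum
print(classify(True, True, False))   # continuous spectrum
print(classify(True, True, True))    # resolvent set
```

Because the tests are applied in a fixed order and each $\lambda$ exits at exactly one of them, the three parts of the spectrum are pairwise disjoint and, together with the resolvent set, exhaust the complex plane.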


[Flowchart. START: Is $(\lambda I - T)$ one-to-one? If no, $\lambda$ is in the Point Spectrum, $P\sigma(T)$, of $T$. If yes: Is the range of $(\lambda I - T)$ dense in $X$? If no, $\lambda$ is in the Residual Spectrum, $R\sigma(T)$, of $T$. If yes: Is the inverse of $(\lambda I - T)$ defined on its range continuous? If no, $\lambda$ is in the Continuous Spectrum, $C\sigma(T)$, of $T$. If yes, $\lambda$ is in the Resolvent Set, $\rho(T)$, of $T$.]

Figure 6.5.1.


EXERCISES 


1. Let $T: X \to X$ be a bounded linear operator on a complex Banach space $X$. We
say that a complex number $\lambda$ belongs to the approximate point spectrum of $T$,
denoted $AP\sigma(T)$, if there is a sequence of vectors $\{x_n\}$ in $X$ with $1 \le \|x_n\|$ for
all $n$ and

$$\|(\lambda I - T)x_n\| \to 0$$

as $n \to \infty$. Prove the following:
(a) $\lambda \in AP\sigma(T)$ if and only if $(\lambda I - T)$ is not bounded below.
(b) $AP\sigma(T) \subset \sigma(T)$.
(c) $C\sigma(T) \cup P\sigma(T) \subset AP\sigma(T)$.
(d) If the residual spectrum $R\sigma(T)$ is empty, then $AP\sigma(T) = \sigma(T)$.


2. Examples of spectra will be discussed in Section 6. Here we consider the Volterra
integral operator (of the first kind), $y = Kx$, where

$$y(t) = \int_0^t k(t,s)\,x(s)\,ds.$$

Assume that the kernel $k(t,s)$ is continuous for $0 \le s \le t \le 1$. Show that
$K: L_2[0,1] \to L_2[0,1]$ and that $K$ is bounded. Show that $\sigma(K) = \{0\}$ and that 0
is in the approximate point spectrum.


3. Let $L: X \to X$ be a linear operator on $X$ and let $\{\lambda_1,\lambda_2,\dots\}$ be a set of distinct
eigenvalues for $L$. Further, let $x_n$ be an eigenvector associated with $\lambda_n$, $n =
1,2,\dots$. Show that the set $\{x_1,x_2,\dots\}$ is linearly independent.


4. Show that $\lambda = (1 + i\omega)^{-1}$ is in the continuous spectrum of the operator

$$y(t) = \int_{-\infty}^{t} e^{-(t-\tau)}\,x(\tau)\,d\tau$$

given in Example 1.


5. Construct a nonzero linear operator $L$ (other than $K$ in Exercise 2) with the
property that $\sigma(L)$ consists of the number 0 only. [Hint: Try it for $L: C^2 \to C^2$.]


6. Let $X$ be a closed linear subspace in a Banach space $Y$ and let $L: Y \to Y$ be a
linear operator with the property that $L(Y) \subset X$. Let $\sigma_Y(L)$ denote the spectrum
of $L: Y \to Y$ and $\sigma_X(L)$ the spectrum of the restriction of $L$ to $X$. Show that
$\sigma_X(L) \subset \sigma_Y(L)$.


7. Find the eigenvalues for $y = Kx$, where


y(t) = fa — 3tt)x(t) dt. 


8. Assume that $k(t,\tau) > 0$ for $a \le t,\tau \le b$ and also let $\lambda$ be a nonzero eigenvalue of
$y = Kx$, where

$$y(t) = \int_a^b k(t,\tau)\,x(\tau)\,d\tau.$$

Assume that the corresponding eigenfunction $\phi(t)$ is positive for $a \le t \le b$.
Show that $\mathcal{N}(\lambda I - K)$ is one-dimensional.


9. Let $L$ be a compact self-adjoint operator on a Hilbert space $H$ and assume
that $\sigma(L)$ is one of the following three sets: $\{0\}$, $\{1\}$, $\{0,1\}$. Show that $L$ is an
orthogonal projection.


6. EXAMPLES OF SPECTRA


In this section we determine the spectra for a number of widely applied linear
operators. Although our main interest in this and the next chapter is in compact
operators, we will not limit our attention to such operators here. This will, among
other things, allow the reader to develop an understanding of the place of compact
operators within the class of linear operators in general.


EXAMPLE 1. (FINITE-DIMENSIONAL CASE.) Let $X$ be a finite-dimensional Banach
space and let $T$ be a linear transformation of $X$ into itself. Since (Theorem 4.7.7)

$$\dim[\mathcal{N}(\lambda I - T)] + \dim[\mathcal{R}(\lambda I - T)] = \dim X,$$

it follows that $(\lambda I - T)$ is one-to-one if and only if $\mathcal{R}(\lambda I - T) = X$. Therefore the
residual spectrum of $T$ is empty. Further, if $(\lambda I - T)$ is one-to-one, it has an inverse
defined on $X$. Since any linear transformation defined on a finite-dimensional space
is continuous (Theorem 5.10.4), $(\lambda I - T)^{-1}$ is continuous and the continuous spectrum
of $T$ is empty. Hence, we have shown that for the finite-dimensional case, $T$
has a pure point spectrum, that is, $\sigma(T) = P\sigma(T)$. If $[T]$ is the matrix representing $T$
relative to a basis or coordinate system $\{x_1,\dots,x_n\}$ of $X$, then we have

$$P\sigma(T) = \{\lambda \in C : \det(\lambda I - [T]) = 0\},$$

which we know is nonempty and contains at most $n$ complex numbers $\lambda$, see
Section 4.

This example demonstrates why the reader, who is familiar with finite-dimensional
linear operator theory only, may never have been confronted with the continuous
and residual spectra before. ∎
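For a $2 \times 2$ matrix the condition $\det(\lambda I - [T]) = 0$ is a quadratic equation, so the point spectrum can be computed in closed form. The sketch below uses two assumed example matrices, a rotation by a quarter turn and a nilpotent matrix; the function name `point_spectrum_2x2` is hypothetical.

```python
import cmath

def point_spectrum_2x2(T):
    """Roots of det(lam*I - T) = lam^2 - tr(T)*lam + det(T) for a 2x2 matrix T."""
    tr = T[0][0] + T[1][1]
    det = T[0][0] * T[1][1] - T[0][1] * T[1][0]
    disc = cmath.sqrt(tr * tr - 4 * det)
    return {(tr + disc) / 2, (tr - disc) / 2}

rotation = [[0, -1], [1, 0]]    # no real eigenvalues, two complex ones
nilpotent = [[0, 1], [0, 0]]    # nonzero matrix whose only eigenvalue is 0

print(point_spectrum_2x2(rotation))
print(point_spectrum_2x2(nilpotent))
```

The rotation has the two eigenvalues $i$ and $-i$, which is why the example is stated over the complex numbers. The nilpotent matrix shows that the spectrum may contain fewer than $n$ points.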


Shift operators on sequence spaces have many applications. For example, z- 
transform techniques are based on shift operators. The next four examples develop 
the spectrum for four important cases. 


EXAMPLE 2. (RIGHT SHIFT ON $l_2(-\infty,\infty)$.) Let $H$ be the Hilbert space
$l_2(-\infty,\infty)$ with the usual inner product.
We define the right shift operator $S_r: H \to H$ by $y = S_r x$, where

$$y_k = x_{k-1}, \qquad k = \dots,-1,0,1,\dots, \tag{6.6.1}$$

for $x = \{\dots,x_{-1},x_0,x_1,\dots\}$ and $y = \{\dots,y_{-1},y_0,y_1,\dots\}$. That is, $S_r$ shifts the
sequence $x$ to the right by one position. Needless to say, this operator is a building
block with which difference equations are formed. For example, the difference
operator $A$ defined by

$$(Ax)_k = x_k + 2x_{k-1} + 5x_{k-2}, \qquad k = \dots,-1,0,1,\dots,$$

can be written

$$A = S_r^0 + 2S_r + 5S_r^2,$$

where we use the convention $S_r^0 = I$.
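The way the difference operator is assembled from powers of the shift can be tried on finitely supported sequences. In this sketch a sequence is stored as a dictionary from index to value, with missing indices read as zero; that representation and the function names are assumptions made for illustration.

```python
def shift_right(x):
    """(S_r x)_k = x_{k-1}: the entry stored at index k-1 moves to index k."""
    return {k + 1: v for k, v in x.items()}

def add_scaled(terms):
    """Linear combination of sequences, given as (coefficient, sequence) pairs."""
    out = {}
    for coeff, seq in terms:
        for k, v in seq.items():
            out[k] = out.get(k, 0) + coeff * v
    return {k: v for k, v in out.items() if v != 0}

def difference_operator(x):
    """A = S_r^0 + 2 S_r + 5 S_r^2, that is (Ax)_k = x_k + 2x_{k-1} + 5x_{k-2}."""
    s1 = shift_right(x)
    s2 = shift_right(s1)
    return add_scaled([(1, x), (2, s1), (5, s2)])

impulse = {0: 1}                              # x_0 = 1, all other entries zero
print(difference_operator(impulse))           # {0: 1, 1: 2, 2: 5}
print(difference_operator({0: 1, 1: -1}))     # {0: 1, 1: 1, 2: 3, 3: -5}
```

Applied to the unit impulse, the operator returns its own coefficients 1, 2, 5 at indices 0, 1, 2, which is the impulse response of the difference equation.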
First let us see where $\lambda I - S_r$ is one-to-one. Let $x$ be a point in $H$ such that

$$(\lambda I - S_r)x = 0,$$

that is,

$$\lambda x_k - x_{k-1} = 0, \qquad k = \dots,-1,0,1,2,\dots.$$

If $\lambda = 0$, it is obvious that $S_r$ is one-to-one. If $\lambda \ne 0$, $x$ is of the form

$$x = \{\dots, c\lambda^2, c\lambda, c, c\lambda^{-1}, c\lambda^{-2}, \dots\},$$

where $c$ is a constant. But for arbitrary nonzero $\lambda$, this sequence is in $H$ only if $c = 0$.
Therefore, $\lambda I - S_r$ is one-to-one for all $\lambda$. It follows that the point spectrum of
$S_r$ is empty.

Next let us consider the range of $(\lambda I - S_r)$. We consider three cases: $|\lambda| > 1$,
$|\lambda| < 1$, and $|\lambda| = 1$.

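$|\lambda| > 1$ Case


In this case the range of $(\lambda I - S_r)$ is all of $H$. Indeed, let $y = \{y_k\}$ be any point
in $H$. We can show that $x = \{x_k\}$, where

$$x_k = \frac{y_k}{\lambda} + \frac{y_{k-1}}{\lambda^2} + \frac{y_{k-2}}{\lambda^3} + \cdots, \qquad k = \dots,-1,0,1,\dots, \tag{6.6.2}$$

is a preimage of $y$. First let us show that $x$ is in $H$. Since $\{y_k\}$ is in $H$ and $|\lambda| > 1$, the
infinite series (6.6.2) converges for all $k$. Then

$$\sum_{k=-N}^{N} |x_k|^2 = \sum_{k=-N}^{N} \sum_{i=0}^{\infty} \sum_{j=0}^{\infty} \frac{y_{k-i}\,\overline{y_{k-j}}}{\lambda^{i+1}\,\bar{\lambda}^{j+1}},$$

where the bar denotes complex conjugate. Changing the order of summation, a step
that can be justified, yields

$$\sum_{k=-N}^{N} |x_k|^2 = \sum_{i=0}^{\infty} \sum_{j=0}^{\infty} \frac{1}{\lambda^{i+1}\,\bar{\lambda}^{j+1}} \sum_{k=-N}^{N} y_{k-i}\,\overline{y_{k-j}}.$$

But using the Schwarz Inequality one has

$$\left| \sum_{k=-N}^{N} y_{k-i}\,\overline{y_{k-j}} \right| \le \sum_{k=-\infty}^{\infty} |y_k|^2$$

for all $N$. So

$$\sum_{k=-\infty}^{\infty} |x_k|^2 \le \left[ \sum_{j=0}^{\infty} \frac{1}{|\lambda|^{j+1}} \right]^2 \sum_{k=-\infty}^{\infty} |y_k|^2. \tag{6.6.3}$$

That is, $\{x_k\}$ is in $H$ for arbitrary $\{y_k\}$ in $H$. Noting that

$$\lambda x_k - x_{k-1} = \left( y_k + \frac{y_{k-1}}{\lambda} + \frac{y_{k-2}}{\lambda^2} + \cdots \right) - \left( \frac{y_{k-1}}{\lambda} + \frac{y_{k-2}}{\lambda^2} + \cdots \right) = y_k,$$

we see that $x = \{x_k\}$ is a preimage of $y = \{y_k\}$. We have shown, then, that $\lambda I - S_r$
maps $H$ onto itself for all $|\lambda| > 1$. We have also exhibited the inverse of $\lambda I - S_r$ in
(6.6.2). Moreover, (6.6.3) shows that this inverse is continuous. Hence, any $\lambda$ with
$|\lambda| > 1$ is in the resolvent set of $S_r$.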
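Formula (6.6.2) and the bound (6.6.3) can be checked numerically on a finitely supported input. The sketch below assumes $\lambda = 2$, stores sequences as dictionaries from index to value, and truncates the output after a fixed number of entries beyond the support of $y$; the truncation length `terms` and the function names are illustrative choices.

```python
def resolvent_apply(y, lam, terms=200):
    """x_k = sum_{i>=0} y_{k-i} / lam^(i+1), formula (6.6.2), for a finitely
    supported y. Entries of x are computed up to `terms` steps past the
    support of y; beyond that they are geometrically small and are dropped."""
    lo, hi = min(y), max(y)
    x = {}
    for k in range(lo, hi + terms):
        total = 0j
        for i in range(0, k - lo + 1):        # y_{k-i} vanishes for k - i < lo
            total += y.get(k - i, 0) / lam ** (i + 1)
        x[k] = total
    return x

def lam_minus_shift(x, lam):
    """((lam*I - S_r) x)_k = lam * x_k - x_{k-1}."""
    keys = set(x) | {k + 1 for k in x}
    return {k: lam * x.get(k, 0) - x.get(k - 1, 0) for k in keys}

def norm_squared(seq):
    return sum(abs(v) ** 2 for v in seq.values())

lam = 2.0
y = {0: 1.0, 1: -2.0, 3: 0.5}
x = resolvent_apply(y, lam)
residual = lam_minus_shift(x, lam)

# (lam*I - S_r) x reproduces y up to rounding and truncation error.
print(max(abs(residual[k] - y.get(k, 0)) for k in residual))
# Bound (6.6.3): the geometric series sums to 1 / (|lam| - 1).
print(norm_squared(x), norm_squared(y) / (abs(lam) - 1) ** 2)
```

Each $x_k$ uses only $y_n$ with $n \le k$, so the inverse computed here never looks ahead in the sequence.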


$|\lambda| < 1$ Case

In this case again the range of $\lambda I - S_r$ is all of $H$. In fact, if $y = \{y_k\}$ is an
arbitrary point in $H$, then its preimage $x = \{x_k\}$ is given by

$$x_k = -y_{k+1} - \lambda y_{k+2} - \lambda^2 y_{k+3} - \cdots, \qquad k = \dots,-1,0,1,\dots.$$

This can be shown using an argument that is analogous to the argument used above
in the $|\lambda| > 1$ case. Similarly, $(\lambda I - S_r)$ has a continuous inverse defined on $H$.
Hence, any $\lambda$ with $|\lambda| < 1$ is in the resolvent set of $S_r$.


(It is interesting to note that in the $|\lambda| > 1$ case, $x_k$ is independent of $y_n$ for $n > k$,
whereas in the $|\lambda| < 1$ case $x_k$ is independent of $y_n$ for $n \le k$. In linear systems
theory one would say that in one case the inverse $(\lambda I - S_r)^{-1}$ is causal and in the
other it is anticausal.)


$|\lambda| = 1$ Case


This case is slightly more complicated. First let us consider the $\lambda = 1$ case. To
begin with, the range of $I - S_r$ is not all of $H$. Indeed, if $x = \{x_k\}$ is any point in $H$
and if $y_k$ is defined by

$$y_k = x_k - x_{k-1},$$

then it follows that

$$\sum_{k=-N}^{M} y_k = (x_{-N} - x_{-N-1}) + (x_{-N+1} - x_{-N}) + \cdots + (x_{M-1} - x_{M-2}) + (x_M - x_{M-1}) = x_M - x_{-N-1}.$$

Since $x \in H$ one has $x_M \to 0$ and $x_{-N-1} \to 0$ as $M, N \to \infty$. Hence,

$$\sum_{k=-\infty}^{\infty} y_k = 0.$$

Let $M$ denote the subspace of $H$ defined by

$$M = \left\{ y \in H : \sum_{k=-\infty}^{\infty} y_k = 0 \right\}.$$

This is a proper subset of $H$ and the range of $I - S_r$ is contained in it; therefore, the
range of $I - S_r$ is not all of $H$. Nor, as a matter of fact, is the range of $I - S_r$ all of
$M$. For example, let $\{y_k\}$ be the sequence

$$y_k = 0 \text{ for } k < 0, \qquad y_0 = 1, \qquad y_k = \frac{1}{\sqrt{k+1}} - \frac{1}{\sqrt{k}} \text{ for } k > 0.$$

This sequence is in $M$, but a corresponding sequence $\{x_k\}$ satisfying

$$y_k = x_k - x_{k-1}$$

must be of the form

$$x_k = c \text{ for } k < 0, \qquad x_0 = 1 + c, \qquad x_k = \frac{1}{\sqrt{k+1}} + c \text{ for } k > 0,$$

where $c$ is a constant. But for no constant $c$ is the sequence in $H$; therefore, $I - S_r$
does not map $H$ onto $M$.


It turns out that the range of $I - S_r$ is the subspace $R$ of $M$ made up of all
sequences $\{y_k\}$ such that

$$\sum_{k=-\infty}^{\infty} |y_k + y_{k-1} + y_{k-2} + \cdots|^2 < \infty$$

or, equivalently,

$$\sum_{k=-\infty}^{\infty} |y_{k+1} + y_{k+2} + y_{k+3} + \cdots|^2 < \infty.$$

The fact that each term in these series exists and that the two series have the same
limit follow from the relationship

$$\sum_{k=-\infty}^{\infty} y_k = 0.$$

Let us now show that $R \supset \mathcal{R}(I - S_r)$. Let $\{x_k\}$ be any point in $H$, and define $y_k$ by

$$x_k = y_k + x_{k-1}, \qquad k = \dots,-1,0,1,\dots.$$

One then has

$$x_k = x_{k-N-1} + y_k + y_{k-1} + \cdots + y_{k-N}.$$

Since

$$\lim_{N \to \infty} x_{k-N-1} = 0,$$

one gets

$$x_k = \sum_{i=0}^{\infty} y_{k-i} = y_k + y_{k-1} + \cdots, \qquad k = \dots,-1,0,1,\dots.$$

Therefore, $\{y_k\}$ is in $R$. Next let us show that $R \subset \mathcal{R}(I - S_r)$. Let $\{y_k\}$ be any point
in $R$ and consider the sequence

$$x_k = y_k + y_{k-1} + \cdots = -y_{k+1} - y_{k+2} - \cdots.$$

Since $\{y_k\}$ is in $R$, $\{x_k\}$ is in $H$. Moreover, $\{x_k\}$ is obviously a preimage of $\{y_k\}$.
Hence, the range of $(I - S_r)$ is $R$.

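It can be shown that $R$ is dense in $H$. Let $\{z_k\}$ be an arbitrary point in $H$. Then given an $\epsilon > 0$, there exists an integer $N$ such that

$$\sum_{k=-\infty}^{-N-1} |z_k|^2 < \epsilon \quad \text{and} \quad \sum_{k=N+1}^{\infty} |z_k|^2 < \epsilon.$$

Let $\{y_k\}$ be a sequence in $H$ such that $y_k = z_k$ for $-N \le k \le N$. Further let

$$C = \sum_{k=-N}^{N} y_k.$$

Let $K$ be an integer such that

$$\frac{|C|^2}{K} < \epsilon.$$

Then let

$$y_{N+1} = -\frac{C}{K}, \quad y_{N+2} = -\frac{C}{K}, \quad \ldots, \quad y_{N+K} = -\frac{C}{K}.$$

It follows that

$$\sum_{k=N+1}^{N+K} |y_k|^2 = K \frac{|C|^2}{K^2} < \epsilon.$$

Let all other entries in the sequence $\{y_k\}$ be zero. It is easily seen that $\{y_k\}$ is in $R$ and that

$$\sum_{k=-\infty}^{\infty} |z_k - y_k|^2 < 3\epsilon.$$

Hence, $R$ is dense in $H$.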

Thus we know what the range of $I - S_r$ is, and we know how to represent the inverse of $I - S_r$ defined on its range. Let us now show that the inverse is not continuous. Let $y^N$ be the following sequences with $4N^2$ nonzero entries:

$$y^N = \Bigl\{ \ldots, 0, 0, \frac{+1}{2N}, \frac{+1}{2N}, \ldots, \frac{+1}{2N}, \frac{-1}{2N}, \frac{-1}{2N}, \ldots, \frac{-1}{2N}, 0, 0, \ldots \Bigr\}.$$

It can be seen that $y^N$ is in the range of $I - S_r$ for each $N = 1, 2, \ldots$ and $\|y^N\| = 1$. On the other hand,

$$(I - S_r)^{-1} y^N = x^N,$$

where

$$x_k^N = 0 \quad \text{for } k < -2N^2,$$
$$x_{-2N^2}^N = \frac{1}{2N}, \quad x_{-2N^2+1}^N = \frac{2}{2N}, \quad \ldots, \quad x_{-1}^N = \frac{2N^2}{2N},$$
$$x_0^N = \frac{2N^2}{2N}, \quad x_1^N = \frac{2N^2 - 1}{2N}, \quad \ldots, \quad x_{2N^2-1}^N = \frac{1}{2N},$$
$$x_k^N = 0 \quad \text{for } k > 2N^2.$$

Since

$$\|x^N\|^2 = \frac{1}{2N^2} \bigl\{ 1 + 2^2 + 3^2 + \cdots + (2N^2)^2 \bigr\} = \frac{1}{2N^2} \cdot \frac{2N^2 (2N^2 + 1)(4N^2 + 1)}{6}$$

by Exercise 1.5.5, it follows that $\|(I - S_r)^{-1} y^N\| \to \infty$ as $N \to \infty$. Thus, $(I - S_r)^{-1}$ is not continuous, and $\lambda = 1$ is in the continuous spectrum of $S_r$.
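The unboundedness of the inverse can also be seen numerically. The following is a minimal sketch, not part of the original text: the helper names `y_sequence`, `preimage`, and `l2_norm` are hypothetical, and the nonzero entries of $y^N$ are simply stored in a finite list. The index bookkeeping at the sign change may differ by one position from the listing above, which does not affect the growth of the norm.

```python
import math

def y_sequence(N):
    """Nonzero part of y^N: 2N^2 entries +1/(2N) followed by 2N^2 entries -1/(2N)."""
    m = 2 * N * N
    return [1.0 / (2 * N)] * m + [-1.0 / (2 * N)] * m

def preimage(y):
    """Running sums x_k = y_k + y_{k-1} + ..., so that y_k = x_k - x_{k-1}."""
    x, total = [], 0.0
    for value in y:
        total += value
        x.append(total)
    return x

def l2_norm(seq):
    """Euclidean norm of a finitely supported sequence."""
    return math.sqrt(sum(abs(v) ** 2 for v in seq))

if __name__ == "__main__":
    for N in (1, 2, 4, 8):
        y = y_sequence(N)
        x = preimage(y)
        # the norm of y stays at 1 while the norm of x keeps growing with N
        print(N, round(l2_norm(y), 6), round(l2_norm(x), 3))
```

Every $y^N$ has unit norm, yet the norms of the preimages grow without bound, so no constant can bound $\|(I - S_r)^{-1} y\|$ by a multiple of $\|y\|$.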


Finally, let us show that each $\lambda$ with $|\lambda| = 1$ is in the continuous spectrum of $S_r$. Again we have

$$y_k = \lambda x_k - x_{k-1}, \qquad k = \ldots, -1, 0, 1, \ldots,$$

where $|\lambda| = 1$. Let $x_k = \lambda^{-k} z_k$ and $y_k = \lambda^{-k+1} w_k$; then

$$\lambda^{-k+1} w_k = \lambda^{-k+1} z_k - \lambda^{-k+1} z_{k-1}$$

or

$$w_k = z_k - z_{k-1}.$$

In other words, we have the $\lambda = 1$ case again. Thus each $\lambda$ with $|\lambda| = 1$ is in the continuous spectrum of $S_r$. The situation is sketched in Figure 6.6.1. The point and residual spectra are empty. ∎


Figure 6.6.1. Spectrum of $S_r$ or $S_l$ on $l_2(-\infty, \infty)$. (Figure labels: Resolvent Set; Continuous Spectrum.)


EXAMPLE 3. (RIGHT SHIFT ON $l_2[0,\infty)$.) We consider the right shift again, but this time $S_r$ is defined on $H = l_2[0,\infty)$. In particular, if $x = \{x_0, x_1, x_2, \ldots\}$, then

$$y = S_r x = \{0, x_0, x_1, \ldots\}.$$

Surprisingly enough this slight change does cause a change in the spectrum. First let us show that $\lambda I - S_r$ is one-to-one for all $\lambda$. Let $(\lambda I - S_r)x = 0$, that is,

$$\lambda x_0 = 0, \qquad \lambda x_k - x_{k-1} = 0, \quad k = 1, 2, \ldots.$$

It is easily seen that the only solution to this equation is $x = 0$. Therefore, $\lambda I - S_r$ is one-to-one, that is, the point spectrum is empty.

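For $|\lambda| > 1$ essentially the same argument as that used in the preceding example shows that $\lambda$ is in the resolvent set of $S_r$. Similarly, a simple variation of the argument shows that each $\lambda$ with $|\lambda| = 1$ is in the continuous spectrum of $S_r$. On the other hand, we have a new situation for the $|\lambda| < 1$ case. First note that if $y_k = \lambda x_k - x_{k-1}$, then

$$\sum_{k=0}^{N} \lambda^k y_k = (\lambda x_0) + \lambda(\lambda x_1 - x_0) + \lambda^2(\lambda x_2 - x_1) + \cdots + \lambda^N(\lambda x_N - x_{N-1}) = \lambda^{N+1} x_N \to 0 \quad \text{as } N \to \infty.$$

Hence

$$\sum_{k=0}^{\infty} \lambda^k y_k = 0.$$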


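Thus when $|\lambda| < 1$, the range of $(\lambda I - S_r)$ is orthogonal to $\Lambda = \{1, \bar\lambda, \bar\lambda^2, \ldots\}$. We can show that the range of $\lambda I - S_r$ is exactly the subspace $M = \{y \in H : y \perp \Lambda\}$. Indeed we will let $\{y_k\}$ be any point in $M$, and consider the sequence $\{x_k\}$ given by

$$x_0 = -y_1 - \lambda y_2 - \lambda^2 y_3 - \cdots$$
$$x_1 = -y_2 - \lambda y_3 - \lambda^2 y_4 - \cdots$$
$$\vdots$$
$$x_k = -y_{k+1} - \lambda y_{k+2} - \lambda^2 y_{k+3} - \cdots$$

By using the argument preceding Inequality (6.6.3) one can show that the sequence $\{x_k\}$ is in $H$ whenever $\{y_k\}$ is in $M$. Since

$$\sum_{k=0}^{\infty} \lambda^k y_k = 0,$$

it follows that

$$x_0 = \frac{1}{\lambda} y_0$$
$$x_1 = \frac{1}{\lambda} y_1 + \frac{1}{\lambda^2} y_0$$
$$\vdots$$
$$x_k = \frac{1}{\lambda} y_k + \frac{1}{\lambda^2} y_{k-1} + \cdots + \frac{1}{\lambda^{k+1}} y_0$$

and this sequence is a preimage for $y$. Thus the range of $\lambda I - S_r$ is $M$, which is not dense in $H$, so each $\lambda$ with $|\lambda| < 1$ is in the residual spectrum of $S_r$. The situation is shown in Figure 6.6.2. ∎

Figure 6.6.2. Spectrum of $S_r$ on $l_2[0, \infty)$. (Figure labels: Resolvent Set; Continuous Spectrum; Residual Spectrum.)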
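The orthogonality condition of Example 3 can be checked on finitely supported sequences. The sketch below is illustrative only; the function names `apply_lambda_minus_shift`, `pairing`, and `recover` are hypothetical, and sequences in $l_2[0,\infty)$ are represented by finite Python lists that are taken to be zero beyond their last entry.

```python
def apply_lambda_minus_shift(lam, x):
    """y = (lam*I - S_r)x for a finitely supported x in l_2[0, inf)."""
    padded = list(x) + [0.0]  # one extra slot for the shifted last entry
    return [lam * padded[k] - (padded[k - 1] if k > 0 else 0.0)
            for k in range(len(padded))]

def pairing(lam, y):
    """sum_k lam^k y_k, which vanishes for every y in the range."""
    return sum((lam ** k) * y[k] for k in range(len(y)))

def recover(lam, y):
    """Forward substitution x_0 = y_0/lam, x_k = (y_k + x_{k-1})/lam, for lam != 0."""
    x, prev = [], 0.0
    for value in y:
        prev = (value + prev) / lam
        x.append(prev)
    return x

if __name__ == "__main__":
    lam = 0.3 + 0.4j
    x = [1.0, -2.0, 0.5j, 3.0]
    y = apply_lambda_minus_shift(lam, x)
    print(abs(pairing(lam, y)))                # close to zero: y is in the range
    print(abs(pairing(lam, [1.0, 0.0, 0.0])))  # nonzero: {1, 0, 0, ...} is not in the range
    print(recover(lam, y))                     # reproduces x, followed by a value near zero
```

The sequence $\{1, 0, 0, \ldots\}$ has a nonzero pairing, which is one concrete way to see that the range is a proper, non-dense subspace when $|\lambda| < 1$.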


EXAMPLE 4. (LEFT SHIFT ON $l_2(-\infty,\infty)$.) In this example we will consider the left shift operator $S_l$ on $H = l_2(-\infty,\infty)$. The operator $S_l$ shifts a sequence $x = \{\ldots, x_{-1}, x_0, x_1, \ldots\}$ to the left by one position. That is, if $y = S_l x$, where $y = \{\ldots, y_{-1}, y_0, y_1, y_2, \ldots\}$, then

$$y_k = x_{k+1}, \qquad k = \ldots, -1, 0, 1, \ldots.$$

(We note in passing that $S_l$ is the adjoint as well as the inverse of $S_r$.) It is a simple matter to show that the spectrum of $S_l$ is the same as that of $S_r$ in Example 2. The argument is almost the same. ∎


EXAMPLE 5. (LEFT SHIFT ON $l_2[0,\infty)$.) Here we will consider $S_l$ defined on $H = l_2[0,\infty)$. That is,

$$S_l \{x_0, x_1, x_2, \ldots\} = \{x_1, x_2, x_3, \ldots\}.$$

In this case, as in Example 4, $S_l$ is the adjoint of $S_r$. But now it is only the left inverse of $S_r$. First let us see where $(\lambda I - S_l)$ is one-to-one. Let $x \in H$ be such that $(\lambda I - S_l)x = 0$, that is,

$$\lambda x_k - x_{k+1} = 0, \qquad k = 0, 1, 2, \ldots.$$

It follows that any sequence $\{x_k\}$ satisfying this difference equation is of the form

$$\{c, \lambda c, \lambda^2 c, \ldots\},$$

where $c$ is a constant. A nontrivial sequence of this form is in $H$ if and only if $|\lambda| < 1$. Thus, for any $\lambda$ satisfying $|\lambda| < 1$, $\lambda I - S_l$ is not one-to-one, and $\lambda$ is in the point spectrum of $S_l$. Again using arguments similar to those used in the preceding examples, one can show that each $\lambda$ with $|\lambda| = 1$ is in the continuous spectrum of $S_l$ and each $\lambda$ with $|\lambda| > 1$ is in the resolvent set. The situation is sketched in Figure 6.6.3. ∎
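As a small illustration of the point spectrum in Example 5, the geometric sequence $\{1, \lambda, \lambda^2, \ldots\}$ can be truncated and tested directly. This is a sketch under the assumption that a long finite list is an adequate stand-in for the infinite sequence; the names `left_shift`, `geometric`, and `norm_squared` are hypothetical.

```python
def left_shift(x):
    """S_l drops the first entry of the sequence."""
    return x[1:]

def geometric(lam, n):
    """First n terms of the candidate eigenvector {1, lam, lam^2, ...}."""
    return [lam ** k for k in range(n)]

def norm_squared(x):
    """Squared l_2 norm of a finite list."""
    return sum(abs(v) ** 2 for v in x)

if __name__ == "__main__":
    lam = 0.5 - 0.5j
    x = geometric(lam, 60)
    shifted = left_shift(x)
    # S_l x agrees with lam * x on every coordinate that survives the truncation
    print(max(abs(shifted[k] - lam * x[k]) for k in range(len(shifted))))
    # partial sums approach 1 / (1 - |lam|^2) when |lam| < 1
    print(norm_squared(x), 1.0 / (1.0 - abs(lam) ** 2))
```

For $|\lambda| \ge 1$ the same partial sums grow without bound, which is why no eigenvector in $l_2[0,\infty)$ exists there.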


Figure 6.6.3. Spectrum of $S_l$ on $l_2[0, \infty)$. (Figure labels: Point Spectrum; Continuous Spectrum; Resolvent Set.)


So much for examples involving the shift operator. Another operator which 
plays an important role in linear analysis is the derivative operator. The next three 
examples treat three cases of differentiation defined in a Hilbert space. 


EXAMPLE 6. Let $X$ be the linear subspace of $L_2(-\infty,\infty)$ made up of all absolutely continuous functions $x$ such that $dx/dt$ is in $L_2(-\infty,\infty)$. We note that it can be shown that $X$ is dense in $L_2(-\infty,\infty)$. We consider the differential operator $y = Dx$, or

$$y = \frac{dx}{dt},$$

where $\mathcal{D}(D) = X$ and $\mathcal{R}(D) \subset L_2(-\infty,\infty)$. Let us see first where $\lambda I - D$ is one-to-one. Let $x \in X$ be such that $(\lambda I - D)x = 0$, that is,

$$\lambda x - \frac{dx}{dt} = 0.$$

We know from the theory of differential equations that all (absolutely continuous) solutions of this equation are of the form

$$x(t) = c e^{\lambda t}, \qquad t \in (-\infty, \infty),$$

where $c$ is a constant. Clearly $x$ is in $L_2(-\infty,\infty)$ if and only if $c = 0$. Therefore $(\lambda I - D)$ is one-to-one for all $\lambda$.


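Next let us consider the range of $(\lambda I - D)$, that is, all $y$ such that

$$y = \lambda x - \frac{dx}{dt} \quad \text{a.e.},$$

where, recall, a.e. is the abbreviation for "almost everywhere."

First consider the case $\operatorname{Re} \lambda \ne 0$. Let $y$ be an arbitrary point in $L_2(-\infty,\infty)$. We claim that a preimage of $y$ is given by

$$x(t) = -\int_{-\infty}^{t} e^{\lambda(t-\tau)} y(\tau)\, d\tau, \quad \text{for } \operatorname{Re} \lambda < 0 \qquad (6.6.4)$$

and by

$$x(t) = +\int_{t}^{\infty} e^{\lambda(t-\tau)} y(\tau)\, d\tau, \quad \text{for } \operatorname{Re} \lambda > 0. \qquad (6.6.5)$$

Indeed, by differentiating (6.6.4) we get

$$\frac{dx}{dt} = -\lambda \int_{-\infty}^{t} e^{\lambda(t-\tau)} y(\tau)\, d\tau - y(t) = \lambda x(t) - y(t) \quad \text{(a.e.)}$$

and by differentiating (6.6.5) we get

$$\frac{dx}{dt} = +\lambda \int_{t}^{\infty} e^{\lambda(t-\tau)} y(\tau)\, d\tau - y(t) = \lambda x(t) - y(t) \quad \text{(a.e.)}$$

So we see that $(\lambda I - D)$ is a mapping of $X$ onto $L_2(-\infty,\infty)$ for all $\lambda$ with $\operatorname{Re} \lambda \ne 0$. Moreover, (6.6.4) and (6.6.5) are representations of $(\lambda I - D)^{-1}$ and it can be shown that both correspond to continuous transformations. Thus, any $\lambda$ with $\operatorname{Re} \lambda \ne 0$ is in the resolvent set of $D$.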
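The representation (6.6.4) can be sanity-checked numerically. The sketch below is only an illustration under stated assumptions: $\lambda$ is taken real and negative, $y$ is assumed negligible to the left of the starting point `t0`, the integral is advanced with the trapezoid rule, and the derivative is approximated by central differences. The function names are hypothetical.

```python
import math

def resolvent_preimage(lam, y, t0, t1, h):
    """Approximate x(t) = -int_{-inf}^t exp(lam*(t - tau)) y(tau) dtau on a grid,
    for real lam < 0, assuming y is negligible for t < t0."""
    n = int(round((t1 - t0) / h))
    ts = [t0 + i * h for i in range(n + 1)]
    decay = math.exp(lam * h)
    xs = [0.0]
    for i in range(n):
        # trapezoid rule for the contribution of [t_i, t_{i+1}]
        step = 0.5 * h * (decay * y(ts[i]) + y(ts[i + 1]))
        xs.append(decay * xs[-1] - step)
    return ts, xs

def max_residual(lam, y, ts, xs, h):
    """Largest |lam*x - dx/dt - y| over interior grid points."""
    worst = 0.0
    for i in range(1, len(ts) - 1):
        dx = (xs[i + 1] - xs[i - 1]) / (2 * h)
        worst = max(worst, abs(lam * xs[i] - dx - y(ts[i])))
    return worst

if __name__ == "__main__":
    lam, h = -1.0, 0.005
    y = lambda t: math.exp(-t * t)
    ts, xs = resolvent_preimage(lam, y, -8.0, 8.0, h)
    print(max_residual(lam, y, ts, xs, h))  # small, limited by discretization error
```

The residual shrinks as the step `h` is reduced, consistent with $x$ being a preimage of $y$ under $\lambda I - D$.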

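Now consider the case $\operatorname{Re} \lambda = 0$. First, we let $\lambda = 0$. We then want to solve the equation

$$y(t) = -\frac{dx}{dt} \quad \text{(a.e.)}$$

Next, we note that

$$\int_{-a}^{b} y(t)\, dt = -\int_{-a}^{b} \frac{dx}{dt}\, dt = x(-a) - x(b).$$

However, we can show that $\lim_{b \to \infty} x(b) = 0$ and $\lim_{a \to \infty} x(-a) = 0$ (see Exercise 15, Section 5.22). Thus, every point in the range of $D$ satisfies the condition

$$\lim_{N \to \infty} \int_{-N}^{N} y(t)\, dt = 0.$$

Let $M$ be the linear subspace defined by

$$M = \Bigl\{ y \in L_2(-\infty,\infty) : \lim_{N \to \infty} \int_{-N}^{N} y(t)\, dt = 0 \Bigr\}.$$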

Since the range $\mathcal{R}(D)$ is contained in $M$ and $M$ is a proper subspace of $L_2(-\infty,\infty)$, $D$ is not a mapping of $X$ onto $L_2(-\infty,\infty)$. Further, it can be shown that $\mathcal{R}(D)$ is not all of $M$. In fact, let

$$y(t) = \begin{cases} 0 & \text{for } t < 0 \\ -2 & \text{for } 0 \le t \le 1 \\ t^{-3/2} & \text{for } 1 < t < \infty. \end{cases}$$

The function $y$ is in $M$. Now if $y$ has a preimage $x$ in $X$, then

$$\frac{dx}{dt} = y \quad \text{(a.e.)}$$

which implies that

$$x(t) = c_0 = \text{constant}, \qquad t < 0$$
$$x(t) = c_0 - 2t, \qquad 0 \le t \le 1$$
$$x(t) = c_1 - 2t^{-1/2}, \qquad 1 < t.$$

Obviously such an $x(t)$ is not in $L_2(-\infty,\infty)$ for any choice of constants. So $y$ does not have a preimage in $X$. It can be shown that $\mathcal{R}(D)$ is the subspace of $M$ made up of all $y$ such that

$$\int_{-\infty}^{t} y(\tau)\, d\tau$$

is in $L_2(-\infty,\infty)$. Moreover, it can be shown that this subspace is dense in $L_2(-\infty,\infty)$. Finally, it can be shown that the inverse of $D$ defined on its range can either be represented by

$$x(t) = -\int_{-\infty}^{t} y(\tau)\, d\tau$$

or

$$x(t) = +\int_{t}^{\infty} y(\tau)\, d\tau.$$

Since the mapping is not continuous, it follows that $\lambda = 0$ is in the continuous spectrum of $D$.

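If $\operatorname{Re} \lambda = 0$, but $\lambda \ne 0$, we can transform the problem to the $\lambda = 0$ case just considered. Let $\lambda = i\omega$; then the equation

$$y(t) = i\omega x - \frac{dx}{dt}$$

can be changed to

$$u(t) = -\frac{dv}{dt}$$

by setting $y(t) = e^{i\omega t} u(t)$ and $x(t) = e^{i\omega t} v(t)$.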


Figure 6.6.4. Spectrum of $D$ on $L_2(-\infty, \infty)$. (Figure labels: Resolvent Set; Continuous Spectrum.)

In summary, then, each $\lambda$ with $\operatorname{Re} \lambda \ne 0$ is in the resolvent set and each $\lambda$ with $\operatorname{Re} \lambda = 0$ is in the continuous spectrum of $D$ (see Figure 6.6.4). ∎


EXAMPLE 7. Here we let $X$ be the linear subspace of $L_2[0,\infty)$ made up of all functions $x$ that are absolutely continuous on $[0,\infty)$ and such that $dx/dt$ is in $L_2[0,\infty)$. Moreover, we require that $x(0) = 0$. Since we will be interested in the operator $D = d/dt$, this latter requirement simply limits the solutions we consider of the differential equation

$$y = \lambda x - \frac{dx}{dt}$$

to those with zero initial conditions.

Since $x(0) = 0$, the only solution to the homogeneous equation

$$\lambda x - \frac{dx}{dt} = 0$$

in $X$ is the trivial solution. It follows that $(\lambda I - D)$ is one-to-one for all $\lambda$. A simple change in the argument of the preceding example shows that all $\lambda$'s with $\operatorname{Re} \lambda = 0$ are in the continuous spectrum of $D$. It is also easily shown on the basis of the preceding example that all $\lambda$'s with $\operatorname{Re} \lambda < 0$ are in the resolvent set of $D$. In fact, in this case the inverse of $(\lambda I - D)$ is represented by

$$x(t) = -\int_{0}^{t} e^{\lambda(t-\tau)} y(\tau)\, d\tau,$$

and this is a continuous transformation. The situation is different now for $\operatorname{Re} \lambda > 0$. One might be tempted to say that

$$x(t) = +\int_{t}^{\infty} e^{\lambda(t-\tau)} y(\tau)\, d\tau, \qquad (6.6.6)$$

which is a bounded linear transformation defined on $L_2[0,\infty)$, is the inverse of $(\lambda I - D)$ for $\operatorname{Re} \lambda > 0$. However, it is not necessarily the case that one has

$$x(0) = +\int_{0}^{\infty} e^{-\lambda \tau} y(\tau)\, d\tau = 0. \qquad (6.6.7)$$


As a matter of fact, the range of $(\lambda I - D)$ is not even dense in $L_2[0,\infty)$. We claim that the range is the subspace $R$ of $L_2[0,\infty)$ that is orthogonal to $e^{-\bar\lambda t}$. Indeed, if

$$y = \lambda x - \frac{dx}{dt},$$

then $x(0) = 0$ implies that

$$\int_{0}^{\infty} e^{-\lambda t} \Bigl( \lambda x - \frac{dx}{dt} \Bigr) dt = 0.$$

So $\mathcal{R}(\lambda I - D) \subset R$. Going the other way, if $y \in R$, it follows immediately that (6.6.6) yields a preimage $x \in X$. So $R \subset \mathcal{R}(\lambda I - D)$. Thus $R = \mathcal{R}(\lambda I - D)$. Since $R$ is not dense in $L_2[0,\infty)$, it follows that $\lambda$'s with $\operatorname{Re} \lambda > 0$ are in the residual spectrum of $D$. The situation is sketched in Figure 6.6.5. ∎


Figure 6.6.5. Spectrum of $D$ on $L_2[0, \infty)$, with Boundary Condition $x(0) = 0$. (Figure labels: $\operatorname{Re} \lambda = 0$; Continuous Spectrum; Resolvent Set; Residual Spectrum.)


EXAMPLE 8. This example is the same as the preceding one except for the fact that $x(0)$ is no longer required to be zero. It is easily shown that each $\lambda$ with $\operatorname{Re} \lambda = 0$ is in the continuous spectrum of $D$. However, everywhere else things are different. For $\operatorname{Re} \lambda < 0$, $e^{\lambda t}$ is a nontrivial solution of

$$(\lambda I - D)x = 0.$$

So each $\lambda$ with $\operatorname{Re} \lambda < 0$ is in the point spectrum of $D$. However, for $\operatorname{Re} \lambda > 0$, $\lambda I - D$ has a continuous inverse defined on $L_2[0,\infty)$ that can be represented by

$$x(t) = +\int_{t}^{\infty} e^{\lambda(t-\tau)} y(\tau)\, d\tau.$$

So each $\lambda$ with $\operatorname{Re} \lambda > 0$ is in the resolvent set of $D$. The situation is shown in Figure 6.6.6. ∎


We invite the reader to note the interesting parallels between the shift operators and the derivative operators.

The next example shows the connection between the spectrum of a linear time- 
invariant system and its frequency response. 


Figure 6.6.6. Spectrum of $D$ on $L_2[0, \infty)$, with No Boundary Condition. (Figure labels: $\operatorname{Re} \lambda = 0$; Continuous Spectrum; Point Spectrum; Resolvent Set.)


EXAMPLE 9. Let $H = L_2(-\infty,\infty)$ and let $T$ be the bounded linear mapping of $H$ into itself defined by

$$(Tx)(t) = \int_{-\infty}^{\infty} h(t - \tau) x(\tau)\, d\tau, \qquad (6.6.8)$$

where $h(t) = 0$ for $t < 0$ and

$$h(t) = A_1 e^{a_1 t} + \cdots + A_n e^{a_n t} \quad \text{for } t > 0$$

and the coefficients $A_1, \ldots, A_n$ are complex numbers and the exponents $a_1, \ldots, a_n$ are distinct complex numbers in the left-hand plane. Let $H(s)$ denote the one-sided Laplace transform of $h(t)$, that is,

$$H(s) = \int_{0}^{\infty} h(t) e^{-st}\, dt.$$

Then we assert that the continuous spectrum of $T$ is given by

$$C\sigma(T) = \{\lambda : \lambda = H(i\omega) \text{ for some extended real number } \omega\}.$$

We say extended real number because $H(\infty) = 0$ is also a point in the continuous spectrum. Further, we assert that the rest of the complex plane is in the resolvent set. A typical situation is shown in Figure 6.6.7. We shall merely sketch the proof of these assertions here because this proof is very similar in spirit to that used in Example 6.

Figure 6.6.7.

We note that $H(s)$ is a rational function, that is, $H(s) = P(s)/Q(s)$, where $P(s)$ and $Q(s)$ are polynomials. It follows that

$$\frac{1}{\lambda - H(s)} = \frac{Q(s)}{\lambda Q(s) - P(s)}.$$

Using the partial fraction expansion, we have

$$\frac{1}{\lambda - H(s)} = \frac{Q(s)}{\lambda Q(s) - P(s)} = B_0(\lambda) + \frac{B_1(\lambda)}{s + \mu_1(\lambda)} + \cdots + \frac{B_n(\lambda)}{s + \mu_n(\lambda)}.$$

(Implicitly we have assumed that we do not have repeated roots for the above value of $\lambda$. In general, this is not the case, and the partial fraction expansion becomes slightly more involved. However, we ignore this detail here.)

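Assume in (6.6.8) that $-\mu_1(\lambda), \ldots, -\mu_k(\lambda)$ are in the left-hand plane and $-\mu_{k+1}(\lambda), \ldots, -\mu_n(\lambda)$ are in the right-hand plane. Then $(\lambda I - T)$ has a continuous inverse defined on all of $L_2(-\infty,\infty)$. That is, $\lambda$ is in the resolvent set. This inverse is represented by

$$(\lambda I - T)^{-1} y = B_0(\lambda) y(t) + B_1(\lambda) \int_{-\infty}^{t} e^{-\mu_1(\lambda)(t-\tau)} y(\tau)\, d\tau + \cdots + B_k(\lambda) \int_{-\infty}^{t} e^{-\mu_k(\lambda)(t-\tau)} y(\tau)\, d\tau$$
$$\qquad - B_{k+1}(\lambda) \int_{t}^{\infty} e^{-\mu_{k+1}(\lambda)(t-\tau)} y(\tau)\, d\tau - \cdots - B_n(\lambda) \int_{t}^{\infty} e^{-\mu_n(\lambda)(t-\tau)} y(\tau)\, d\tau.$$

However, if one of the roots is on the imaginary axis, that is, if $\lambda = H(i\omega)$ for some real number $\omega$, then $(\lambda I - T)^{-1}$ is not continuous. Similarly, $T^{-1}$ (that is, $\lambda = 0$) is not continuous. ∎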
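To make the link between the spectrum and the frequency response concrete, the curve $\lambda = H(i\omega)$ can be sampled directly. The sketch below uses the closed form $H(s) = \sum_j A_j/(s - a_j)$, which is the Laplace transform of the exponential sum $h$ for $\operatorname{Re} s > \max_j \operatorname{Re} a_j$; the function names and the sample data are hypothetical and chosen only for illustration.

```python
def transfer_function(coeffs, exponents, s):
    """H(s) = sum_j A_j / (s - a_j), the Laplace transform of
    h(t) = sum_j A_j exp(a_j t) for t > 0."""
    return sum(A / (s - a) for A, a in zip(coeffs, exponents))

def spectrum_curve(coeffs, exponents, omegas):
    """Sample points H(i*omega) of the curve described in Example 9."""
    return [transfer_function(coeffs, exponents, 1j * w) for w in omegas]

if __name__ == "__main__":
    # single exponential h(t) = exp(-t): H(i*omega) = 1 / (1 + i*omega)
    omegas = [-100.0, -1.0, 0.0, 1.0, 100.0]
    for w, value in zip(omegas, spectrum_curve([1.0], [-1.0], omegas)):
        print(w, value, abs(value - 0.5))
```

For this single-exponential case every sampled point lies at distance $1/2$ from $1/2$, so the curve is a circle through the origin; the origin itself is only reached in the limit $\omega \to \pm\infty$, which matches the remark about the extended real numbers.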


EXERCISES

1. We will let $H_r$ be the Hilbert space made up of all doubly infinite sequences $x = \{\ldots, x_{-1}, x_0, x_1, \ldots\}$ such that

$$\sum_{k=-\infty}^{\infty} |x_k|^2 r^{2k} < \infty,$$

where $r$ is a real number and the inner product is given by

$$(x, y) = \sum_{k=-\infty}^{\infty} x_k \bar{y}_k r^{2k}.$$

Discuss the spectra of the left and right shifts on this space. How do these spectra behave as a function of $r$?


2. Let $l_{2,r}$ denote the Hilbert space made up of all sequences $x = \{x_0, x_1, x_2, \ldots\}$ such that

$$\sum_{k=0}^{\infty} |x_k|^2 r^{2k} < \infty,$$

where $r$ is a real number and the inner product is given by

$$(x, y) = \sum_{k=0}^{\infty} x_k \bar{y}_k r^{2k}.$$

Discuss the spectra of the left and right shifts on this space. How do these spectra behave as a function of $r$?


3. Let $L_2^\sigma(-\infty,\infty)$ denote the Hilbert space made up of complex-valued functions $x$ such that

$$\int_{-\infty}^{\infty} |x(t)|^2 e^{-2\sigma t}\, dt < \infty,$$

where $\sigma$ is a real number and the inner product is given by

$$(x, y) = \int_{-\infty}^{\infty} x(t) \overline{y(t)} e^{-2\sigma t}\, dt.$$

Discuss the spectrum of the derivative operator $D = d/dt$, where the domain of $D$ is a proper subset of $L_2^\sigma(-\infty,\infty)$ defined as in Example 6.


4. Similarly to Exercise 3, consider the operator $D$ defined in the Hilbert space $L_2^\sigma[0,\infty)$. Assume that $x \in \mathcal{D}(D)$ implies $x(0) = 0$. (See Example 7.)

5. Repeat Exercise 4 with the boundary condition $x(0) = 0$ removed (see Example 8).

6. Discuss what happens if, in Example 9, $L_2(-\infty,\infty)$ is replaced by $L_2[0,\infty)$. [Hint: Review Examples 6, 7, and 8.]


7. Let $X = L_2(I)$, where $I$ is an interval. Let $f(t)$ be a continuous complex-valued function defined on $I$. Define $F: \mathcal{D}(F) \to X$ by $y = Fx$, where $y(t) = f(t)x(t)$ and $\mathcal{D}(F) = \{x \in X : fx \in X\}$. Show that if $f(t)$ is bounded, then $F$ is bounded. Assume that for each $\lambda$ the set $\{t \in I : f(t) = \lambda\}$ has Lebesgue measure 0; then show that the spectrum of $F$ consists of continuous spectrum only and that $C\sigma(F) = f(I)$.

8. Show that if a linear operator has a finite-dimensional range, then its entire spectrum is pure point spectrum.


9. Discuss the spectrum of the following operators on $l_2(-\infty,\infty)$:

(a) $S_r + 2I$.

(b) $S_r + \lambda I$.

(c) $\beta S_r + \lambda I$, where $\beta \ne 0$.

(d) $S_r^2$.

(e) $\alpha S_r^2 + \beta S_r + \lambda I$, where $\alpha \ne 0$.

10. Repeat Exercise 9 but now on $l_2[0,\infty)$.


6.7. PROPERTIES OF THE SPECTRUM 431 


11. Repeat Exercises 9 and 10 for the operator $S_l$. Do your answers suggest a general theorem?

12. Consider the operator $T = S_r + S_l$ on $l_2[0,\infty)$. Show that $\sigma(T)$ is the real interval $[-2, 2]$ and that $\sigma(T) = C\sigma(T)$.


13. Consider the operator $\Phi$ on $l_2(0,\infty)$ given by

$$\Phi(x_1, x_2, \ldots) = (\phi(1)x_1, \phi(2)x_2, \ldots),$$

where $\sup |\phi(n)| < \infty$.

(a) Show that $\phi(n) \in P\sigma(\Phi)$ for all $n$ and that $\sigma(\Phi)$ is the closure of $P\sigma(\Phi)$.
(b) Is $\Phi$ self-adjoint? If not, when is $\Phi$ self-adjoint?
(c) Is $\Phi$ normal? If not, when is $\Phi$ normal?

14. Consider the operator $L = T + \Phi$ on $l_2(0,\infty)$, where $T$ and $\Phi$ are given in Exercises 12 and 13. Assume that $\phi(n)$ is real-valued.
(a) Show that if $\lambda \in \sigma(L)$, then $|\lambda| \le 2 + \|\Phi\|$, where $\|\Phi\| = \sup |\phi(n)|$.

(b) Show that $L$ is self-adjoint.

(c) Show that $L$ is not compact.

(d) Assume that $\phi(n)$ is real-valued and that $\phi(n) \to \lambda_0$ as $n \to \infty$. Show that $C\sigma(L)$ is the interval $[-2 + \lambda_0, 2 + \lambda_0]$. What is $\sigma(L)$?

(e) Assume that $\phi(n) \ge 0$ and that $|\phi(n)| > 2$ for some $n$. Show that $L$ has at least one eigenvalue.


7. PROPERTIES OF THE SPECTRUM 


Let $T$ be a continuous linear transformation mapping a complex Banach space into itself. In this section we show that the spectrum of $T$ is a compact subset of the complex plane lying in the closed ball $\{z : |z| \le \|T\|\}$. It follows, then, that if $|\lambda| > \|T\|$, then $\lambda$ is in the resolvent set $\rho(T)$. We can see from the examples of the preceding section that this result does not hold for discontinuous operators.

The following theorem is useful in reaching the desired result. 


6.7.1 THEOREM. Let $T$ be a continuous linear transformation of a Banach space $X$ into itself such that $\|T\| < 1$. Then $(I - T)^{-1}$ exists and is continuous. Moreover,

$$(I - T)^{-1} = I + T + T^2 + \cdots = \sum_{n=0}^{\infty} T^n \qquad (6.7.1)$$

where the convergence is in terms of the uniform topology and

$$\|(I - T)^{-1}\| \le (1 - \|T\|)^{-1}. \qquad (6.7.2)$$

Proof: Recall that $Blt[X,X]$ denotes the normed linear space made up of all continuous linear mappings of $X$ into itself, and the norm on $Blt[X,X]$ is the operator norm. Since $X$ is complete, $Blt[X,X]$ is complete.


Since $\|T^n\| \le \|T\|^n$ and since $\|T\| < 1$, it follows that the series $\sum_{n=0}^{\infty} T^n$ is absolutely convergent, that is, $\sum_{n=0}^{\infty} \|T^n\|$ is convergent. It is then a consequence of Theorem 5.4.2 that the series $\sum_{n=0}^{\infty} T^n$ is convergent.

Let $S_N = \sum_{n=0}^{N} T^n$, and $S_\infty = \sum_{n=0}^{\infty} T^n$. We want to show that $S_\infty$ is the inverse of $(I - T)$, that is,

$$(I - T) S_\infty = S_\infty (I - T) = I.$$

However,

$$(I - T) S_N = I - T^{N+1} \to I,$$

since $\|T^{N+1}\| \to 0$ as $N \to \infty$. On the other hand,

$$(I - T) S_N \to (I - T) S_\infty,$$

since

$$\|(I - T) S_N - (I - T) S_\infty\| \le \|I - T\| \cdot \|S_N - S_\infty\| \to 0.$$

Since limits are unique, one has $(I - T) S_\infty = I$. Similarly one shows that $S_\infty (I - T) = I$. The proof of the inequality (6.7.2) is left as an easy exercise. ∎
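A finite-dimensional illustration may help fix ideas. The sketch below treats $X$ as a two-dimensional space with the sup-norm, for which the operator norm of a matrix is its largest absolute row sum; the matrix `T` and all helper names are hypothetical choices made for this example, not taken from the text.

```python
def mat_mul(A, B):
    """Product of two square matrices stored as lists of rows."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def mat_add(A, B):
    return [[a + b for a, b in zip(ra, rb)] for ra, rb in zip(A, B)]

def mat_sub(A, B):
    return [[a - b for a, b in zip(ra, rb)] for ra, rb in zip(A, B)]

def identity(n):
    return [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]

def op_norm(A):
    """Operator norm induced by the sup-norm: the largest absolute row sum."""
    return max(sum(abs(a) for a in row) for row in A)

def neumann_partial_sum(T, N):
    """S_N = I + T + ... + T^N."""
    n = len(T)
    total, power = identity(n), identity(n)
    for _ in range(N):
        power = mat_mul(power, T)
        total = mat_add(total, power)
    return total

if __name__ == "__main__":
    T = [[0.2, 0.5], [-0.3, 0.1]]      # op_norm(T) is about 0.7, below 1
    S = neumann_partial_sum(T, 200)
    residual = mat_sub(mat_mul(mat_sub(identity(2), T), S), identity(2))
    print(op_norm(residual))            # (I - T) S_N = I - T^(N+1), so this is tiny
    print(op_norm(S), 1.0 / (1.0 - op_norm(T)))  # compare with the bound (6.7.2)
```

The first printed value reflects the identity $(I - T)S_N = I - T^{N+1}$ used in the proof, and the second line compares $\|S_N\|$ with the bound $(1 - \|T\|)^{-1}$.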


Please note that this theorem does not say that $\|T\| < 1$ is a necessary condition for the existence and continuity of $(I - T)^{-1}$. It merely says that it is a sufficient condition.


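6.7.2 THEOREM. (NEUMANN EXPANSION.) Let $T$ be a continuous linear mapping of a Banach space $X$ into itself. If $|\lambda| > \|T\|$, then $\lambda$ is in the resolvent set of $T$. Moreover, $(\lambda I - T)^{-1}$ is given by

$$(\lambda I - T)^{-1} = \sum_{n=0}^{\infty} \lambda^{-n-1} T^n \qquad (6.7.3)$$

and

$$\|(\lambda I - T)^{-1}\| \le (|\lambda| - \|T\|)^{-1}. \qquad (6.7.4)$$

Proof: We want to show that $(\lambda I - T)$ has a continuous inverse. But

$$(\lambda I - T) = \lambda \Bigl( I - \frac{1}{\lambda} T \Bigr).$$

Since $\|(1/\lambda) T\| < 1$, it follows from the previous theorem that $(I - (1/\lambda) T)$ has a continuous inverse. Hence,

$$(\lambda I - T)^{-1} = \frac{1}{\lambda} \Bigl( I - \frac{1}{\lambda} T \Bigr)^{-1} = \sum_{n=0}^{\infty} \lambda^{-n-1} T^n,$$

which is continuous. The proof of (6.7.4) is left as an exercise. ∎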
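The expansion (6.7.3) and the bound (6.7.4) can be tried out on a small matrix. This is a sketch under the same assumptions as any finite-dimensional illustration: the space is two-dimensional with the sup-norm (so the operator norm is the largest absolute row sum), and the matrix, the value of $\lambda$, and the function names are hypothetical.

```python
def row_sum_norm(A):
    """Operator norm induced by the sup-norm: the largest absolute row sum."""
    return max(sum(abs(a) for a in row) for row in A)

def mat_mul(A, B):
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def lam_minus(T, lam):
    """The matrix lam*I - T."""
    n = len(T)
    return [[(lam if i == j else 0.0) - T[i][j] for j in range(n)] for i in range(n)]

def resolvent_series(T, lam, terms):
    """Partial sum of sum_n lam^(-n-1) T^n."""
    n = len(T)
    power = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]  # T^0
    total = [[0.0] * n for _ in range(n)]
    for k in range(terms):
        total = [[total[i][j] + power[i][j] / lam ** (k + 1) for j in range(n)]
                 for i in range(n)]
        power = mat_mul(power, T)
    return total

if __name__ == "__main__":
    T = [[1.0, 2.0], [0.0, -1.0]]   # row_sum_norm(T) = 3
    lam = 4.0 + 1.0j                # |lam| exceeds 3
    R = resolvent_series(T, lam, 400)
    print(mat_mul(lam_minus(T, lam), R))                        # close to the identity
    print(row_sum_norm(R), 1.0 / (abs(lam) - row_sum_norm(T)))  # compare with (6.7.4)
```

The product $(\lambda I - T)R$ is the identity up to rounding, and the computed norm of $R$ stays below $(|\lambda| - \|T\|)^{-1}$.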


We see, then, that the resolvent set of a continuous operator is not empty and 
the spectrum is bounded. 




6.7.3 THEOREM. Let $T$ be a continuous linear mapping of a Banach space $X$ into itself. If $\lambda \in \rho(T)$ and $|\mu| < \|(\lambda I - T)^{-1}\|^{-1}$, then $\lambda + \mu \in \rho(T)$. In particular, the resolvent set $\rho(T)$ is an open set in the complex plane.

Proof: Let $\lambda$ be any point in the resolvent set $\rho(T)$. Since

$$(\lambda + \mu) I - T = (\lambda I - T)[I + \mu(\lambda I - T)^{-1}],$$

it follows from Theorem 6.7.1 that $[I + \mu(\lambda I - T)^{-1}]$ has a continuous inverse provided

$$\|\mu(\lambda I - T)^{-1}\| = |\mu| \, \|(\lambda I - T)^{-1}\| < 1.$$

Since $(\lambda I - T)^{-1}$ is continuous, the theorem now follows from Theorem 6.7.2. ∎


In summary, then, we can say the following: 


6.7.4 THEOREM. Let $T$ be a continuous linear transformation of a complex Banach space $X$ into itself. The spectrum of $T$ is a compact subset$^4$ of the complex plane lying in the closed ball $\{z : |z| \le \|T\|\}$.


There remains one technical point to be mentioned; namely, the spectrum of a 
bounded linear operator on a (nontrivial) Banach space is never empty. The reader 
interested in the proof of this general fact is referred to Taylor [2, p. 261]. We shall 
show that this is the case for compact normal operators (Theorem 6.10.16). 


EXERCISES

1. One can actually show that the spectrum of a bounded linear operator $T$ lies in the circle $\{\lambda : |\lambda| \le \limsup \|T^n\|^{1/n}\}$, which is contained in $\{\lambda : |\lambda| \le \|T\|\}$. The following steps lead to a proof of this fact: First show that whenever the series in (6.7.3) converges, it converges to $(\lambda I - T)^{-1}$. Use standard real analysis to show that the series $\sum_{n=0}^{\infty} |\lambda|^{-n-1} \|T^n\|$ converges for $|\lambda| > \limsup \|T^n\|^{1/n}$. This implies that $(\lambda I - T)^{-1}$ exists and is continuous for $|\lambda| > \limsup \|T^n\|^{1/n}$.

2. We will let $p(r)$ be a polynomial in $r$ with complex coefficients, that is, let $p(r) = a_0 r^n + \cdots + a_n$, where $a_0 \ne 0$, $n > 0$. Let $L: X \to X$ be a bounded linear operator on a complex Banach space $X$ and define $p(L)$ by

$$p(L) = a_0 L^n + \cdots + a_n I.$$

Show that the spectra of $L$ and $p(L)$ are related by

$$\sigma(p(L)) = p(\sigma(L)) = \{p(\lambda) : \lambda \in \sigma(L)\}.$$

3. Let $L: X \to X$ be a bounded linear mapping of $X$ onto $X$ and assume that $L^{-1}$ exists and is continuous. Show that

$$\sigma(L^{-1}) = \sigma(L)^{-1} = \{\lambda^{-1} : \lambda \in \sigma(L)\}.$$


4 Recall that a subset of the complex plane is compact if and only if it is closed and bounded. 


4. Let $L: X \to X$ be a linear operator, where $X$ is a Banach space. The spectral radius of $L$ is defined by

$$r_\sigma(L) = \sup\{|\lambda| : \lambda \in \sigma(L)\}.$$

(a) Show that $r_\sigma(L) \le \|L\|$.

(b) Show that if L is the matrix operator (}_ }), then r,(L) < ||L]|. 

(In Exercise 3, Section 6.11 we will show that if $L$ is a compact normal operator on a Hilbert space, then $r_\sigma(L) = \|L\|$.)


5. Let $L$ be a bounded linear operator on a Hilbert space $H$. We define the numerical range of $L$ to be the set of complex numbers

$$W(L) = \{(Lx, x) : \|x\| = 1\}.$$

(a) Show that the spectrum $\sigma(L)$ lies in $\overline{W(L)}$.

(b) Show that if $d(\lambda, W(L)) = p > 0$, then $\|(\lambda I - L)^{-1}\| \le 1/p$.
(c) Show that $W(L)$ is convex.

(d) Show that

$$W(\alpha L + \beta I) = \alpha W(L) + \beta$$

when $\alpha \ne 0$.
(e) Show that $W(ULU^{-1}) = W(L)$ when $U$ is a unitary operator.


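6. Let $L$ be a bounded linear operator on a Hilbert space $H$. Let $x$ and $y$ be fixed elements in $H$.
(a) Show that the function

$$f(\lambda) = ((\lambda I - L)^{-1} x, y)$$

is an analytic function for $\lambda \in \rho(L)$. Let $R_\lambda = (\lambda I - L)^{-1}$.
(b) Show that for $\lambda$, $\mu$ in $\rho(L)$ one has

$$R_\lambda - R_\mu = (\mu - \lambda) R_\lambda R_\mu$$
$$R_\lambda R_\mu = R_\mu R_\lambda.$$

(c) Show that if $\mu \in \rho(L)$ and $|\mu - \lambda| \, \|R_\mu\| < 1$, then $\lambda \in \rho(L)$ and

$$R_\lambda = \sum_{n=0}^{\infty} (\mu - \lambda)^n R_\mu^{n+1}.$$

(d) Show that

$$\frac{d^n}{d\lambda^n} R_\lambda = (-1)^n n! \, R_\lambda^{n+1}.$$

[That is, $R_\lambda$ is analytic for $\lambda \in \rho(L)$.]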


7. Two sequences $\{x_n\}$ and $\{y_n\}$ are said to form a biorthonormal sequence if $(x_n, y_m) = 0$ for $n \ne m$ and $(x_n, y_n) = 1$. A biorthonormal sequence is said to be maximal if finite linear combinations of the $x_n$'s, as well as those of the $y_n$'s, are dense in the basic Hilbert space $H$.

Let $\{x_n\}$ and $\{y_n\}$ be a maximal biorthonormal sequence in a Hilbert space $H$.

(a) Show that if $(x, y_n) = 0$ for all $n$, then $x = 0$.

(b) Show that if $(x_n, y) = 0$ for all $n$, then $y = 0$.

(c) Show that if $\sum_n (x, y_n) x_n$ or $\sum_n (x, x_n) y_n$ converges, then $x = \sum_n (x, y_n) x_n$ or $x = \sum_n (x, x_n) y_n$, respectively.


8. Let $\{e_n\}$ be an orthonormal basis in a Hilbert space $H$ and let $\{x_n\}$ be a sequence in $H$ with the property that there is a $\theta$, $0 \le \theta < 1$, such that

$$\Bigl\| \sum_n a_n (e_n - x_n) \Bigr\|^2 \le \theta^2 \sum_n |a_n|^2 \qquad (6.7.5)$$

for every sequence $\{a_n\}$ of complex numbers.
(a) Show that

$$Kx = \sum_n (x, e_n)(e_n - x_n)$$

defines a bounded linear operator on $H$ with $\|K\| \le \theta$.
(b) Let $T = I - K$ and show that

$$(1 - \theta) \|x\| \le \|Tx\| \le (1 + \theta) \|x\|, \qquad x \in H.$$

(c) Show that $x_n = T e_n$.

(d) Let $y_n = (T^{-1})^* e_n$. Show that $\{x_n\}$ and $\{y_n\}$ form a maximal biorthonormal sequence.

(e) Show that for each $x \in H$ one has

$$x = \sum_n (x, x_n) y_n \quad \text{and} \quad x = \sum_n (x, y_n) x_n.$$


9. Let $x_n(t) = (2\pi)^{-1/2} \exp(i\lambda_n t)$, $n = 0, \pm 1, \pm 2, \ldots$ for $-\pi \le t \le \pi$. [Recall that $e_n(t) = (2\pi)^{-1/2} \exp(int)$, $n = 0, \pm 1, \ldots$ forms an orthonormal basis for $L_2(-\pi, \pi)$.] Assume that

$$M = \sup |\lambda_n - n| < \frac{\log 2}{\pi}.$$

Show that $\{x_n\}$ and $\{e_n\}$ satisfy (6.7.5) with $\theta = e^{M\pi} - 1 < 1$.


10. We will let $L$, $S$, and $T$ be bounded linear operators on a Hilbert space $H$ where $L = T + S$. Assume $S^{-1}$ exists and is compact, and that $\|T\| \, \|S^{-1}\| < 1$. Show that $L^{-1}$ exists and is compact.

11. Let $W$ be a density operator on a Hilbert space $H$. (That is, $W$ is a self-adjoint operator with $0 \le W^2 \le W$ and $\operatorname{tr} W = 1$. See Section 5.25.)

(a) Show that the spectrum $\sigma(W)$ lies in the interval $[0,1]$.

(b) Show that if $1 \in \sigma(W)$, then $\sigma(W) = \{0,1\}$ and $W$ represents a pure state.

(c) Show that if $\lambda$ is an eigenvalue with $\tfrac{1}{2} < \lambda \le 1$, then $\dim \mathcal{N}(W - \lambda I) = 1$.

(d) Show that if $\lambda$ is an eigenvalue with $(n+1)^{-1} < \lambda \le n^{-1}$, then $\dim \mathcal{N}(W - \lambda I) \le n$.


12. Prove Inequalities (6.7.2) and (6.7.4). In Theorem 6.7.3 estimate

$$\|((\lambda + \mu) I - T)^{-1}\|$$

in terms of $|\mu|$ and $\|(\lambda I - T)^{-1}\|$.


13. Consider the Volterra integral operator $K$ given by

$$y(t) = \int_{0}^{t} k(t, \tau) x(\tau)\, d\tau,$$

on $(C[0,T], \|\cdot\|_\infty)$, where $\|\cdot\|_\infty$ denotes the sup-norm. Assume that $k(t,\tau)$ is continuous for $0 \le \tau \le t \le T$ and satisfies $|k(t,\tau)| \le M$.

(a) Show that if $x \in C[0,T]$, then $|Kx(t)| \le Mt \|x\|_\infty$.

(b) Show that

$$|K^n x(t)| \le (M^n t^n / n!) \|x\|_\infty.$$

(c) Show that

$$\|K^n\| \le M^n \frac{T^n}{n!}.$$

(d) Show that $(\lambda I - K)^{-1} = \sum_{n=0}^{\infty} \lambda^{-n-1} K^n$, for all $\lambda \ne 0$. That is, the Neumann series converges for all $\lambda \ne 0$.
(e) Show that $\sigma(K) = \{0\}$.

14. Consider the Volterra integral operator $K$ in Exercise 13 but now on the space $L_2[0,T]$. Show that $\sigma(K) = \{0\}$.


15. (Perturbation Theory). Suppose we wish to solve the equation

$$(I - K)x = y$$

for a given $y$ in a Banach space $X$. Assume that $(I - K)^{-1}$ exists and is continuous. Assume further that there is another operator $K_0$ near $K$ and that the equation

$$(I - K_0) x_0 = y$$

is easier to solve. In this exercise you are asked to show that the "approximate" solution $x_0$ is close to $x$ in the following precise sense:

(a) Assume that $\|K - K_0\| \le \delta$ and $\delta \|(I - K)^{-1}\| < 1$. Show that $(I - K_0)$ has a continuous inverse and that

$$\|x - x_0\| \le \delta \|(I - K_0)^{-1}\| \, \|x\|.$$

[Hint: Show that $x - x_0 = (I - K_0)^{-1}(K - K_0)x$.]
(b) Assume that $\delta \|(I - K_0)^{-1}\| \le r < 1$. Show that

$$\|x - x_0\| \le \frac{r}{1 - r} \|x_0\|.$$


16. (a) Use Exercise 15 to find an approximate solution to the problem

$$x(t) = \int_{0}^{0.5} k(t, \tau) x(\tau)\, d\tau + f(t),$$

where $k(t,\tau) = \sin(t\tau)$ and $f(t) = t^{-1}(\cos t/2 - 1) + 1$. Use the integral equation

$$x_0(t) = \int_{0}^{0.5} k_0(t, \tau) x_0(\tau)\, d\tau + f(t),$$

where $k_0(t,\tau) = t\tau$. [Hint: Show that

$$x_0(t) = Ct + f(t)$$

for an appropriate choice of $C$.]

(b) Show that $\|x - x_0\| \le 0.002$, where the norm is the sup-norm.
(c) Use the above information to guess the solution $x(t)$.


Part C 
Spectral Analysis 


8. RESOLUTIONS OF THE IDENTITY 


In the first part of this chapter we studied bounded linear operators that could be expressed as a finite linear combination of projections. The projections formed a "resolution of the identity." The study of more general operators will require an expanded version of this concept. That is, we need a concept of a resolution of the identity which will allow us to treat infinite collections of orthogonal projections instead of only finite collections. This, however, raises a very important convergence problem, which we have seen once before; see Section 5.8. Let us illustrate this problem by means of a simple example.


EXAMPLE 1. Consider the Hilbert space $l_2 = l_2(0,\infty)$ and let $\{e_n\}$ be the complete orthonormal basis for $l_2$ defined by $e_n = (\delta_{1n}, \delta_{2n}, \ldots)$, that is, $e_1 = (1, 0, \ldots)$, and so on. Define $P_n: l_2 \to l_2$ by

$$P_n x = (x, e_n) e_n, \qquad n = 1, 2, 3, \ldots.$$

Each $P_n$ is, of course, an orthogonal projection on $l_2$. Moreover, we have that

$$P_i P_j = 0 \quad \text{for } i \ne j.$$

And if $x$ is any fixed element in $l_2$, then by the Fourier Series Theorem one has

$$Ix = x = \sum_{n=1}^{\infty} (x, e_n) e_n = \sum_{n=1}^{\infty} P_n x = P_1 x + P_2 x + \cdots,$$

where the convergence is in terms of the norm on $l_2$. It may appear, then, that one can write the identity operator as

$$I = P_1 + P_2 + \cdots + P_n + \cdots. \qquad (6.8.1)$$

However, this is an example of a series that converges strongly but not uniformly (see Section 5.8). That is, if we set

$$S_N = \sum_{n=1}^{N} P_n, \qquad N = 1, 2, \ldots,$$

then $I =_s \lim_{N \to \infty} S_N$, but $\|S_N - I\| = 1$. In order to prove that $\|S_N - I\| = 1$, we use the Fourier Series Theorem to note that

$$\|(I - S_N)x\|^2 = \Bigl\| \sum_{n=N+1}^{\infty} (x, e_n) e_n \Bigr\|^2 = \sum_{n=N+1}^{\infty} |(x, e_n)|^2 \le \sum_{n=1}^{\infty} |(x, e_n)|^2 = \|x\|^2.$$

Hence, we see that $\|I - S_N\| \le 1$. However,

$$\|(I - S_N) e_{N+1}\| = \|(0, \ldots, 0, 1, 0, \ldots)\| = 1.$$

Hence $\|I - S_N\| = 1$ for every $N$. Therefore, it is not true that $\{S_N\}$ converges to $I$ in terms of the uniform topology.

We see then that (6.8.1) is not correct but that one does have

$$I =_s \sum_n P_n. \qquad (6.8.2)$$
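The difference between strong and uniform convergence in this example can be observed on truncated sequences. The following sketch is illustrative only: vectors in $l_2$ are represented by finite lists (coordinates beyond the list are taken to be zero), and the names `tail_norm` and `unit_vector` are hypothetical.

```python
import math

def tail_norm(x, N):
    """||(I - S_N)x||: S_N keeps coordinates 1..N, so only the tail x[N:] remains."""
    return math.sqrt(sum(abs(v) ** 2 for v in x[N:]))

def unit_vector(n, length):
    """The basis vector e_n (1-based index) truncated to `length` coordinates."""
    return [1.0 if k == n - 1 else 0.0 for k in range(length)]

if __name__ == "__main__":
    length = 5000
    x = [1.0 / n for n in range(1, length + 1)]   # a fixed element of l_2
    for N in (10, 100, 1000):
        # for the fixed x the error tends to zero; for e_{N+1} it stays equal to 1
        print(N, tail_norm(x, N), tail_norm(unit_vector(N + 1, length), N))
```

For each fixed $x$ the error $\|(I - S_N)x\|$ decreases to zero, but for every $N$ there is a unit vector, namely $e_{N+1}$, on which the error equals 1, so $\|I - S_N\|$ cannot tend to zero.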


We are now ready to define a resolution of the identity. 


6.8.1 DEFINITION. A sequence of operators $\{P_n\}$ on a Hilbert space $H$ is said to be a resolution of the identity if it is true that (i) each $P_n$ is an orthogonal projection, (ii) $P_n P_m = 0$ if $n \ne m$, and (iii)

$$I =_s \sum_n P_n.$$

We do not rule out the possibility that {P,,} may be finite. Therefore, Definition 
6.8.1 includes Definition 6.2.1 as a special case. The above definition is not the most 
general definition of a resolution of the identity. In particular, when one considers 
noncompact linear operators, resolutions of the identity involving uncountable 
sets of projections are needed. However, for present purposes this definition is 
general enough. 

Let us now show the relation between a resolution of the identity $\{P_n\}$ on a Hilbert space $H$ and the corresponding subspaces $\{\mathcal{R}(P_n)\}$ of $H$.


6.8.2 THEOREM. (a) Let {P,} be a resolution of the identity on a Hilbert space 
H and &(P,,) be the range of P,,. Then &(P,,) L A(P,,) for n# m and 
H=) &(P,). 
(b) Conversely, let {R,} be a collection of closed linear subspaces of a Hilbert 


space H with R, 1 R,, form # nand such that H = _,, R,. Let P,, be the orthogonal 
projection of H onto R,,. Then {P,} is a resolution of the identity on H. 


Proof: (a) Let xe &(P,) and ye A(P,,), where n # m. Then since P,, is self 
adjoint and P,, P,, = 0 one has 


(X,Y) = (Pa XsPimnV) = (Pin Pa XY) = (Oy) = 0. 


6.8. RESOLUTIONS OF THE IDENTITY 44] 


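Hence $\mathcal{R}(P_n) \perp \mathcal{R}(P_m)$ for $n \ne m$. Next let $M$ be the closed linear subspace given by

$$M = \mathcal{R}(P_1) + \mathcal{R}(P_2) + \cdots = \Big\{x \in H : x = \sum_{n=1}^{\infty} x_n,\ x_n \in \mathcal{R}(P_n)\Big\},$$

where we used the Orthogonal Structure Theorem 5.20.2. If $x \in M^{\perp}$, then

$$x = Ix = P_1 x + P_2 x + \cdots = 0,$$

hence $M^{\perp} = \{0\}$. It follows from Theorem 5.15.4(d) that $M = H$.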

(b) First, it is clear that $P_n P_m = 0$ if $n \ne m$. Secondly, it follows from the Orthogonal Structure Theorem 5.20.2 that every $x \in H$ can be written uniquely as $x = x_1 + x_2 + \cdots$, where $x_n \in R_n$. Thus, since $P_n x = x_n$, it also follows that $Ix = P_1 x + P_2 x + \cdots$, that is, $\{P_n\}$ is a resolution of the identity. ∎
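The three conditions of Definition 6.8.1 can be checked by hand in a finite-dimensional setting. The following is a minimal sketch, assuming the Hilbert space is $R^4$ with the usual inner product and the subspaces $R_n$ are spanned by disjoint groups of coordinate axes; the helper names `project`, `add`, and `inner` are illustrative and do not come from the text.

```python
def project(x, indices):
    """Orthogonal projection of x onto the coordinates listed in `indices`."""
    return [xi if i in indices else 0.0 for i, xi in enumerate(x)]


def add(u, v):
    """Componentwise sum of two vectors."""
    return [a + b for a, b in zip(u, v)]


def inner(u, v):
    """Real inner product on R^n."""
    return sum(a * b for a, b in zip(u, v))


# Three mutually orthogonal coordinate subspaces R_1, R_2, R_3 of R^4.
blocks = [{0}, {1, 2}, {3}]
x = [1.0, -2.0, 3.0, 0.5]

pieces = [project(x, b) for b in blocks]      # P_1 x, P_2 x, P_3 x

total = [0.0] * len(x)                        # P_1 x + P_2 x + P_3 x
for p in pieces:
    total = add(total, p)

composed = project(project(x, blocks[0]), blocks[1])  # P_2 P_1 x
```

Here `total` reproduces `x` (condition (iii)), `composed` is the zero vector (condition (ii)), and the pieces are mutually orthogonal, as part (a) of Theorem 6.8.2 asserts.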


EXERCISES 


1. Let $P_T$ denote the orthogonal projection of $L_2(-\infty,\infty)$ onto $L_2(-\infty,T]$. Let $\{T_n\}$ be a strictly monotone sequence (that is, $T_n > T_{n-1}$) with $T_n \to \infty$ and define $Q_n$ by $Q_0 = P_{T_0}$, $Q_n = P_{T_n} - P_{T_{n-1}}$. Show that $\{Q_0, Q_1, \ldots\}$ is a resolution of the identity on $L_2(-\infty,\infty)$.


2. Let $P_n$ denote the projection onto the $n$th coordinate in the Hilbert space $l_2(-\infty,\infty)$, and let

$$Q_n = \sum_{i=-\infty}^{n} P_i.$$

Let $T$ be a bounded linear operator on $l_2(-\infty,\infty)$. Show that the following statements are equivalent:
(a) $T$ is causal.
(b) For all $n$ one has: $Q_n x = Q_n y \Rightarrow Q_n T x = Q_n T y$.
(c) For all $n$ one has: $Q_n x = 0 \Rightarrow Q_n T x = 0$.
(d) For all $n$ one has $T(\mathcal{N}(Q_n)) \subseteq \mathcal{N}(Q_n)$.
Where is the linearity of $T$ required in the above list?


3. Let $\{e_n\}$ be an orthonormal set in a Hilbert space $H$ and define $P_n : H \to H$ by $P_n x = (x, e_n)e_n$.
(a) Show that $P_n$ is an orthogonal projection and that $P_n P_m = 0$ whenever $n \ne m$.
(b) Show that $\{P_n\}$ is a resolution of the identity if and only if $\{e_n\}$ is a maximal orthonormal set.
(c) Use the Fourier Series Theorem to give other characterizations of the statement that $\{P_n\}$ is a resolution of the identity.




9. WEIGHTED SUMS OF PROJECTIONS 


In the last section we considered resolutions of the identity $\{P_n\}$. In this section we consider a special class of linear operators that can be constructed using resolutions of the identity. In particular, we consider weighted sums of projections.

The reason for the importance of weighted sums of projections is that many 
linear operators with important applications turn out to be weighted sums of projec- 
tions in disguise. 


6.9.1 DEFINITION. Let $H$ be a Hilbert space and let $\{P_n\}$ be a resolution of the identity defined on $H$. Further, let $\{\lambda_n\}$ be a sequence of scalars. A transformation of the form

$$Tx = \sum_n \lambda_n P_n x, \qquad x \in \mathcal{D}(T), \tag{6.9.1}$$

where

$$\mathcal{D}(T) = \Big\{x \in H : \lim_{N \to \infty} \sum_{n=1}^{N} \lambda_n P_n x \text{ exists}\Big\},$$

is said to be a weighted sum of projections.


Note that we do have to be careful here about the domain of $T$. The obvious reason is that we have placed no constraints on the set $\{\lambda_n\}$. If we let $T_N$ denote the continuous linear operator

$$T_N = \sum_{n=1}^{N} \lambda_n P_n,$$

it follows from the definition that $T$ is the strong limit of the sequence $\{T_N\}$. We do not rule out the possibility that $T$ may be unbounded.


6.9.2 LEMMA. A weighted sum of projections is linear. 


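Proof: Let $x_1$ and $x_2$ be any two points in $\mathcal{D}(T)$, and let

$$y_1 = \lim_{N \to \infty} \sum_{n=1}^{N} \lambda_n P_n x_1, \qquad y_2 = \lim_{N \to \infty} \sum_{n=1}^{N} \lambda_n P_n x_2.$$

We want to show that for any scalars $\alpha_1$ and $\alpha_2$, the limit

$$\lim_{N \to \infty} \sum_{n=1}^{N} \lambda_n P_n(\alpha_1 x_1 + \alpha_2 x_2)$$

exists and is equal to $\alpha_1 y_1 + \alpha_2 y_2$. But

$$\Big\|\alpha_1 y_1 + \alpha_2 y_2 - \sum_{n=1}^{N} \lambda_n P_n(\alpha_1 x_1 + \alpha_2 x_2)\Big\| \le |\alpha_1|\,\Big\|y_1 - \sum_{n=1}^{N} \lambda_n P_n x_1\Big\| + |\alpha_2|\,\Big\|y_2 - \sum_{n=1}^{N} \lambda_n P_n x_2\Big\|.$$

Completion of the proof is now left to the reader. ∎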




According to our definition of resolution of the identity, it is possible that $P_n = 0$ for certain $n$. In order to avoid having to handle this special, yet trivial case, we assume here that only nontrivial projections occur in our resolutions of the identity. Also let us assume that $H$ is infinite dimensional. Further, let us assume that $\{P_n\}$ is infinite. Otherwise the problems we consider next are, by and large, trivial.

The next lemma provides a simple characterization of the situation where the 
domain of T is all of H. 


6.9.3 LEMMA. $\mathcal{D}(T) = H$ if and only if the set $\{|\lambda_1|, |\lambda_2|, \ldots\}$ is bounded.


Proof: Suppose first that the set $\{|\lambda_1|, |\lambda_2|, \ldots\}$ is bounded; that is, there exists a real number $M > 0$ such that $|\lambda_n| \le M$ for all $n$. Let us show that

$$y_N = \sum_{n=1}^{N} \lambda_n P_n x$$

is a Cauchy sequence for arbitrary $x \in H$. Since $\{P_n\}$ is a resolution of the identity one has $x = \sum_n P_n x$ and $\|x\|^2 = \sum_n \|P_n x\|^2$ by the Orthogonal Structure Theorem. Hence the sequence of partial sums $z_N = \sum_{n=1}^{N} \|P_n x\|^2$ is a Cauchy sequence. Since, for $N > K$,

$$\|y_N - y_K\|^2 = \Big\|\sum_{n=K+1}^{N} \lambda_n P_n x\Big\|^2 = \sum_{n=K+1}^{N} |\lambda_n|^2 \|P_n x\|^2 \le M^2 |z_N - z_K|,$$

it follows that $\{y_N\}$ is a Cauchy sequence and therefore convergent, since $H$ is complete. It follows then that $\mathcal{D}(T) = H$.

Now assume that $\mathcal{D}(T) = H$. We must show that the set $\{|\lambda_1|, |\lambda_2|, \ldots\}$ is bounded. We argue by contradiction, that is, we show that if $\{|\lambda_1|, |\lambda_2|, \ldots\}$ is not bounded, then $\mathcal{D}(T) \ne H$.

If $\{|\lambda_n|\}$ is not bounded, there exists a subsequence $\{|\lambda_{n_1}|, |\lambda_{n_2}|, |\lambda_{n_3}|, \ldots\}$ such that $|\lambda_{n_k}| \ge k$. Corresponding to each $\lambda_{n_k}$ is the projection $P_{n_k}$ with nontrivial range $\mathcal{R}(P_{n_k})$. Let $x_{n_k} \in \mathcal{R}(P_{n_k})$ and $\|x_{n_k}\| = 1$. It is easily shown that $x$ given by

$$x = \sum_k \frac{1}{k}\, x_{n_k}$$

is a point in $H$. On the other hand, the series

$$\sum_k \frac{\lambda_{n_k}}{k}\, x_{n_k}$$

obviously does not converge since it is a sum of mutually orthogonal vectors, each of norm at least one. Therefore, $x$ is not in $\mathcal{D}(T)$ and $\mathcal{D}(T) \ne H$. ∎
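The counterexample in the second half of the proof can be illustrated numerically. The sketch below assumes the simplest concrete instance, which is not spelled out in the text: one-dimensional ranges with weights $\lambda_k = k$ and the vector with coefficients $1/k$. The names `norm_sq`, `coeffs`, and `weights` are illustrative. Because the pieces are mutually orthogonal, squared norms of partial sums are just sums of squared coefficients.

```python
def norm_sq(coeffs):
    """Squared norm of a vector given by its coefficients in an orthonormal family."""
    return sum(abs(c) ** 2 for c in coeffs)


N = 1000
coeffs = [1.0 / k for k in range(1, N + 1)]    # coefficients of x
weights = [float(k) for k in range(1, N + 1)]  # assumed weights lambda_k = k

x_norm_sq = norm_sq(coeffs)                    # partial sum of ||x||^2
tx_norm_sq = norm_sq([w * c for w, c in zip(weights, coeffs)])  # ||T_N x||^2
```

The partial sums of $\|x\|^2$ stay below $\pi^2/6$, so $x$ is in $H$, while $\|T_N x\|^2$ grows like $N$, so the sequence $\{T_N x\}$ cannot converge.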


We can show that $\mathcal{D}(T)$ is dense in $H$ no matter what the set $\{\lambda_n\}$ is.


6.9.4 LEMMA. $\overline{\mathcal{D}(T)} = H$.




Proof: Let $x$ be an arbitrary point in $H$ and let $\varepsilon$ be any positive number. We must show that there exists an $x_\varepsilon \in \mathcal{D}(T)$ such that $\|x - x_\varepsilon\| < \varepsilon$. Since $\mathcal{R}(P_1) + \mathcal{R}(P_2) + \cdots = H$ and these ranges are mutually orthogonal, we can write $x$ uniquely as

$$x = x_1 + x_2 + \cdots,$$

where $x_i \in \mathcal{R}(P_i)$, $i = 1, 2, \ldots$. Furthermore, there exists an integer $N$ such that

$$\|x - x_1 - x_2 - \cdots - x_N\| < \varepsilon.$$

But $(x_1 + x_2 + \cdots + x_N) \in \mathcal{D}(T)$. So $x_\varepsilon = x_1 + \cdots + x_N$ will do. ∎


We can also easily state necessary and sufficient conditions for the continuity 
of T. 


6.9.5 LEMMA. $T$ is continuous if and only if the set $\{|\lambda_1|, |\lambda_2|, \ldots\}$ is bounded. Moreover, $\|T\| = \sup\{|\lambda_1|, |\lambda_2|, \ldots\}$.


Proof: Suppose $\{|\lambda_1|, |\lambda_2|, \ldots\}$ is bounded and $M > 0$ is a bound. Let $x$ be any point in $H$. Then by the Orthogonal Structure Theorem one has

$$\|Tx\|^2 = \Big\|\lim_{N \to \infty} \sum_{n=1}^{N} \lambda_n P_n x\Big\|^2 = \lim_{N \to \infty} \sum_{n=1}^{N} |\lambda_n|^2 \|P_n x\|^2 \le M^2 \lim_{N \to \infty} \sum_{n=1}^{N} \|P_n x\|^2 = M^2 \|x\|^2. \tag{6.9.2}$$

So $T$ is continuous and $\|T\| \le M$.

Next suppose that $T$ is continuous. Then

$$\|Tx\| \le \|T\|\,\|x\|$$

for all $x \in H$. Let $x_n$ be a nonzero point in $\mathcal{R}(P_n)$. It follows that $Tx_n = \lambda_n x_n$ and $|\lambda_n| \le \|T\|$ for all $n$. Thus $\{|\lambda_n|\}$ is bounded by $\|T\|$. Finally let $\alpha = \sup_n \{|\lambda_n|\}$. We see that $\alpha \le \|T\|$. But Inequality (6.9.2) also holds for $M = \alpha$, so $\|T\| \le \alpha$. ∎
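The norm formula $\|T\| = \sup_n |\lambda_n|$ can be sanity-checked on a small diagonal example. This is a sketch under the assumption that $H = C^3$ and each $P_n$ projects onto one coordinate axis; the weights and sample vectors are arbitrary illustrative choices, and `apply_weighted_sum` and `norm` are hypothetical helper names.

```python
import math


def apply_weighted_sum(weights, x):
    """Apply T = sum_n lambda_n P_n, with P_n the n-th coordinate projection."""
    return [w * c for w, c in zip(weights, x)]


def norm(x):
    """Euclidean norm on C^n."""
    return math.sqrt(sum(abs(c) ** 2 for c in x))


weights = [0.5, -2.0, 1 + 1j]            # illustrative lambda_1, lambda_2, lambda_3
bound = max(abs(w) for w in weights)     # sup |lambda_n|

samples = [[1, 0, 0], [0, 1, 0], [0, 0, 1], [1, 1, 1], [3, -1, 2j]]
ratios = [norm(apply_weighted_sum(weights, x)) / norm(x) for x in samples]
```

Every ratio $\|Tx\|/\|x\|$ is at most `bound`, and the bound is attained at the coordinate vector belonging to the weight of largest modulus, mirroring the two halves of the proof.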


Putting the preceding lemmas together we have the following theorem. 


6.9.6 THEOREM. A weighted sum of projections $T$ is continuous if and only if $\mathcal{D}(T) = H$.


The spectrum of a weighted sum of projections has some particularly interesting properties. It is easily seen that the operator $(\lambda I - T)$ is defined by

$$(\lambda I - T)x = \sum_n (\lambda - \lambda_n) P_n x, \qquad x \in \mathcal{D}(T).$$

Thus $\lambda I - T$ is itself a weighted sum of projections.


6.9.7 LEMMA. $\lambda I - T$ is one-to-one if and only if $\lambda \ne \lambda_n$ for all $n$.


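Proof: Let $x \in \mathcal{D}(T)$ be a point such that $(\lambda I - T)x = 0$. Let $x = x_1 + x_2 + \cdots$, where $x_n \in \mathcal{R}(P_n)$. Then

$$(\lambda I - T)x = \sum_n (\lambda - \lambda_n)x_n = 0.$$

If $\lambda \ne \lambda_n$ for all $n$, then $x_1 = x_2 = \cdots = 0$. So $(\lambda I - T)$ is one-to-one. On the other hand, if $\lambda = \lambda_n$ for some $n$, then $(\lambda_n I - T)x_n = 0$ for some $x_n \in \mathcal{R}(P_n)$ with $\|x_n\| \ne 0$, so $(\lambda_n I - T)$ is not one-to-one. ∎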


It follows immediately, of course, that the point spectrum of $T$ is the set $\{\lambda_1, \lambda_2, \ldots\}$.

Next let us investigate the range of $\lambda I - T$.


6.9.8 LEMMA. The range of $(\lambda I - T)$ is dense in $H$ if and only if $\lambda \ne \lambda_n$, $n = 1, 2, \ldots$.


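Proof: If $\lambda = \lambda_n$ for some $n$, it is clear that the range of $\lambda I - T$ is orthogonal to $\mathcal{R}(P_n)$. Since $P_n \ne 0$, $\mathcal{R}(P_n)$ is a nontrivial closed subspace and $\mathcal{R}(\lambda I - T)$ is not dense in $H$.

Next suppose that $\lambda \ne \lambda_n$ for all $n$, and let $y$ be any point in $H$. Given any $\varepsilon > 0$, we must find an $x_\varepsilon$ in $\mathcal{D}(\lambda I - T)$ such that $\|y - y_\varepsilon\| < \varepsilon$, where $y_\varepsilon = (\lambda I - T)x_\varepsilon$. Let

$$y = y_1 + y_2 + \cdots,$$

where $y_n \in \mathcal{R}(P_n)$, $n = 1, 2, \ldots$. Since $y \in H$, there is an integer $N$ such that

$$\|y - y_1 - y_2 - \cdots - y_N\| < \varepsilon.$$

We let $x_\varepsilon = \sum_{n=1}^{N} (\lambda - \lambda_n)^{-1} y_n$. Then $x_\varepsilon$ is in $\mathcal{D}(\lambda I - T)$ and

$$y_\varepsilon = y_1 + \cdots + y_N = (\lambda I - T)x_\varepsilon. \ ∎$$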


Lemmas 6.9.7 and 6.9.8 show us that if $\lambda$ is not in the point spectrum, it is either in the continuous spectrum or the resolvent set of $T$, that is, the residual spectrum is always empty.

Next let us find out when the range of $(\lambda I - T)$ is all of $H$.


6.9.9 LEMMA. $\mathcal{R}(\lambda I - T) = H$ if and only if the set $\{|\lambda - \lambda_1|, |\lambda - \lambda_2|, \ldots\}$ is bounded away from zero, that is, there exists a $\delta > 0$ such that $|\lambda - \lambda_n| \ge \delta > 0$ for all $n$.


Proof: Suppose that $\{|\lambda - \lambda_n|\}$ is bounded away from zero and $y$ is any point in $H$. Let $y = y_1 + y_2 + \cdots$, where $y_n \in \mathcal{R}(P_n)$. We assert that

$$x = \sum_n (\lambda - \lambda_n)^{-1} y_n$$




is a preimage of $y$. Moreover, the inverse of $(\lambda I - T)$ in this case can be represented by

$$(\lambda I - T)^{-1} y = \sum_n (\lambda - \lambda_n)^{-1} P_n y,$$

that is, $(\lambda I - T)^{-1}$ is a weighted sum of projections. We leave these details as an exercise.

Now let us show that $\mathcal{R}(\lambda I - T) = H$ implies that $\{|\lambda - \lambda_n|\}$ is bounded away from 0. Let $x$ be any point in $\mathcal{D}(\lambda I - T)$ and $y = (\lambda I - T)x$. By definition

$$y = \lim_{N \to \infty} \sum_{n=1}^{N} (\lambda - \lambda_n) P_n x.$$

Since $\lambda$ is not (Lemma 6.9.7) an eigenvalue, $(\lambda - \lambda_k)^{-1} P_k$ is continuous for each $k$. Hence

$$(\lambda - \lambda_k)^{-1} P_k y = \lim_{N \to \infty} (\lambda - \lambda_k)^{-1} P_k \sum_{n=1}^{N} (\lambda - \lambda_n) P_n x.$$

From the definition of resolution of the identity the right-hand side reduces to $P_k x$. Thus

$$(\lambda - \lambda_k)^{-1} P_k y = P_k x.$$

It follows that $y \in \mathcal{R}(\lambda I - T)$ implies that the infinite sum

$$\sum_n (\lambda - \lambda_n)^{-1} P_n y = \sum_n P_n x \tag{6.9.3}$$

exists.

Now if $\{|\lambda - \lambda_n|\}$ is not bounded away from zero, it is easy to pick a $y$ in $H$ for which the sum in (6.9.3) does not exist. That is, $\{|\lambda - \lambda_n|\}$ not bounded away from zero implies that $\mathcal{R}(\lambda I - T) \ne H$, or equivalently, $\mathcal{R}(\lambda I - T) = H$ implies $\{|\lambda - \lambda_n|\}$ is bounded away from zero. ∎


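In any event, if $\lambda \ne \lambda_n$ for all $n$, the operator $(\lambda I - T)$ has an inverse defined on its range.


6.9.10 LEMMA. If $\lambda \ne \lambda_n$ for all $n$, $(\lambda I - T)$ has an inverse defined on its range and

$$(\lambda I - T)^{-1} y = \sum_n (\lambda - \lambda_n)^{-1} P_n y \tag{6.9.4}$$

for $y \in \mathcal{R}(\lambda I - T)$. Moreover, $(\lambda I - T)^{-1}$ is continuous if and only if $\{|\lambda - \lambda_n|\}$ is bounded away from zero.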


The proof of this lemma is an obvious combination of the preceding lemmas. 

We see, then, that $\lambda$ is in the resolvent set of $T$ if and only if $\{|\lambda - \lambda_n|\}$ is bounded away from zero. Further, $\lambda$ is in the continuous spectrum of $T$ if and only if (i) $\lambda \ne \lambda_n$ for all $n$ and (ii) $\{|\lambda - \lambda_n|\}$ is not bounded away from zero. In other words, $\lambda$ is a point of accumulation of the set $\{\lambda_n\}$ but $\lambda$ is not in $\{\lambda_n\}$.
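The classification above can be explored numerically for a concrete weight sequence. The sketch below assumes the illustrative choice $\lambda_n = 1/n$, which does not appear in the text, and computes the gap $\min_n |\lambda - \lambda_n|$ over finite truncations. A finite computation is only suggestive evidence about the infimum over all $n$; it is not a proof. The names `gap` and `weights_up_to` are hypothetical.

```python
def gap(lam, weights):
    """Smallest distance from lam to the given (finitely many) weights."""
    return min(abs(lam - w) for w in weights)


def weights_up_to(N):
    """First N terms of the assumed weight sequence lambda_n = 1/n."""
    return [1.0 / n for n in range(1, N + 1)]


sizes = (10, 100, 1000)
gaps_at_zero = [gap(0.0, weights_up_to(N)) for N in sizes]  # shrinks like 1/N
gaps_at_two = [gap(2.0, weights_up_to(N)) for N in sizes]   # stays at 1
gap_at_half = gap(0.5, weights_up_to(10))                   # 0.5 = lambda_2
```

For this sequence the gap at $\lambda = 2$ stays at 1 (resolvent set), the gap at $\lambda = 1/2$ is zero because $1/2$ is itself a weight (point spectrum), and the gap at $\lambda = 0$ shrinks toward zero without $0$ being a weight, which is the accumulation-point situation described for the continuous spectrum.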




So far we have considered both bounded and unbounded weighted sums of 
projections. For the remainder of this section we restrict our attention to bounded 
weighted sums of projections. 


6.9.11 THEOREM. If

$$T = \sum_n \lambda_n P_n,$$

where $\{P_n\}$ is a resolution of the identity, and if $T$ is bounded, then the adjoint of $T$ is given by

$$T^* = \sum_n \bar{\lambda}_n P_n, \tag{6.9.5}$$

where $\bar{\lambda}_n$ denotes the complex conjugate of $\lambda_n$.


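Proof: Let $x$ and $y$ be any points in $H$, and consider

$$(y, Tx) = \Big(y, \lim_{N \to \infty} \sum_{n=1}^{N} \lambda_n P_n x\Big).$$

Since the inner product is continuous, we have

$$(y, Tx) = \lim_{N \to \infty} \{(y, \lambda_1 P_1 x) + \cdots + (y, \lambda_N P_N x)\}.$$

But $P_n$ is self-adjoint, so $(y, \lambda_n P_n x) = (\bar{\lambda}_n P_n y, x)$ for all $n$; therefore,

$$(y, Tx) = \lim_{N \to \infty} \{(\bar{\lambda}_1 P_1 y, x) + \cdots + (\bar{\lambda}_N P_N y, x)\}.$$

Again using the continuity of the inner product, we have

$$\Big(\lim_{N \to \infty} \sum_{n=1}^{N} \bar{\lambda}_n P_n y, x\Big) = \lim_{N \to \infty} \{(\bar{\lambda}_1 P_1 y, x) + \cdots + (\bar{\lambda}_N P_N y, x)\}.$$

Finally, then

$$(y, Tx) = (T^* y, x)$$

for all $x, y \in H$, where $T^*$ is given by (6.9.5). ∎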


The following result is an obvious consequence of the preceding one. 


6.9.12 COROLLARY. A bounded linear operator $T$ that is the weighted sum of projections,

$$T = \sum_{n=1}^{\infty} \lambda_n P_n,$$

is self-adjoint if and only if all the $\lambda$'s are real.


If some or all of the $\lambda$'s are nonreal, $T$ is still normal. Recall that a normal operator is one that commutes with its adjoint, that is, $TT^* = T^*T$.




6.9.13 THEOREM. A bounded linear operator T that is the weighted sum of 
projections is normal. 


We leave the proof of this theorem to the reader. 

This is an important result for it shows that the only operators that we can 
hope to express as weighted sums of projections are normal ones. 

The bulk of the remainder of this chapter will be concerned with compact 
operators. Let us see when weighted sums of projections are compact. 


6.9.14 THEOREM. A weighted sum of projections is compact if (i) for every nonzero $\lambda_n$ the range of $P_n$, $\mathcal{R}(P_n)$, is finite dimensional and (ii) for every real number $\alpha > 0$ the number of $\lambda_n$'s with $|\lambda_n| > \alpha$ is finite.


Proof: Suppose that $T$ is a weighted sum of projections satisfying (i) and (ii). We must show that $T$ is compact. Recall that a compact transformation maps bounded sets into compact sets. Let $T_N$ be

$$T_N = \sum_{n=1}^{N} \lambda_n P_n,$$

where $\{\lambda_1, \ldots, \lambda_N\}$ are all the $\lambda$'s such that $|\lambda_n| > \varepsilon$, where $\varepsilon > 0$. By our hypotheses $N$ is finite and the range of $T_N$ is finite dimensional. Moreover, since

$$\|(T - T_N)x\|^2 = \Big\|\sum_{n=N+1}^{\infty} \lambda_n P_n x\Big\|^2 \le \sup_{n \ge N+1} |\lambda_n|^2 \sum_n \|P_n x\|^2 \le \varepsilon^2 \|x\|^2,$$

one has $\|T - T_N\| \le \varepsilon$. It follows now from Theorems 5.24.3 and 5.24.8 that $T$ is compact. ∎
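The finite-rank approximation used in this proof can be illustrated with a short computation. The sketch assumes one-dimensional ranges and the illustrative weights $\lambda_n = 1/n$, cut off at 200 terms so the computation is finite; `tail_sup` and `count_above` are hypothetical helper names. For a diagonal operator, the error $\|T - T_N\|$ is the largest modulus among the discarded weights.

```python
def tail_sup(weights, N):
    """sup of |lambda_n| over the weights discarded by T_N (indices n > N)."""
    return max((abs(w) for w in weights[N:]), default=0.0)


def count_above(weights, eps):
    """Number of weights with modulus strictly greater than eps (hypothesis (ii))."""
    return sum(1 for w in weights if abs(w) > eps)


weights = [1.0 / n for n in range(1, 201)]   # assumed weights lambda_n = 1/n

errors = {N: tail_sup(weights, N) for N in (1, 10, 100)}
large = count_above(weights, 0.05)           # weights 1/n with n < 20
```

The error after keeping $N$ terms is $1/(N+1)$, which tends to zero, and only finitely many weights exceed any fixed threshold, which is exactly hypothesis (ii) of Theorem 6.9.14.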


The converse of this theorem is also true. That is, if a weighted sum of projec- 
tions is compact, then (i) and (ii) follow. We will not prove this here because we plan 
to consider a more general case in a later section. 

The rather significant result that we will obtain shortly is that every compact 
normal operator is a weighted sum of projections. 


EXERCISES 


1. Let $T = \sum_n \lambda_n P_n$ be a continuous weighted sum of projections on a Hilbert space $H$. Show that there exists an orthonormal basis of eigenvectors $\{x_1, x_2, \ldots\}$ for $H$. Let $\{\mu_1, \mu_2, \ldots\}$ be the corresponding eigenvalues.

(a) How are the $\mu_n$'s related to the $\lambda_n$'s?
(b) Show that

$$Tx = \sum_n \mu_n (x, x_n) x_n$$

for all $x \in H$.
(c) What happens if $T$ is not continuous?




2. Consider the operator

$$\Phi : (x_1, x_2, \ldots) \to (\phi(1)x_1, \phi(2)x_2, \ldots)$$

on $l_2(0,\infty)$.

(a) Show that $\Phi$ is a weighted sum of projections.

(b) What is the spectrum of $\Phi$?

(c) Assume that $\phi(n) \ne 0$ for all $n$. Show that $\Phi^{-1}$ exists and that $\Phi^{-1}$ is a weighted sum of projections. What is the spectrum of $\Phi^{-1}$?

(d) Assume further that $|\phi(n)| \to \infty$ as $n \to \infty$. Show that $\Phi^{-1}$ is compact.

(e) Assume that $\phi(n) \to \lambda_0$ as $n \to \infty$, where $\lambda_0$ is finite. Show that $(\Phi - \lambda_0 I)$ is compact.


3. (Continuation of Exercise 2.) Let L=T+@=S,+ S,+@. (See Exercises 13 
and 14 of Section 6.) Assume that |(n)| ~ +00 as n— 00. Show that L~? exists 
and is compact. [Hint: Use Exercise 10, Section 7 with S=@®+ AJ, for an 
appropriate choice of 4.] 


10. SPECTRAL PROPERTIES OF COMPACT, NORMAL, AND 
SELF-ADJOINT OPERATORS 


In this section we first investigate the spectral properties of compact operators. 
Then we investigate the spectral properties of self-adjoint and normal operators. In 
the next section we will combine the results of this and the previous section to get 
the Spectral Theorem. 


A. Compact Operators 


The following theorems state the spectral properties of compact linear opera- 
tors which we will need later. 


6.10.1 THEOREM. Let $T$ be a compact linear transformation of a Hilbert space $H$ into itself and let $\lambda \ne 0$. Then the null space $\mathcal{N}(\lambda I - T)$ is finite dimensional.


Proof: The compact operator $T$ maps $\mathcal{N}(\lambda I - T)$ into $\mathcal{N}(\lambda I - T)$. Moreover, the restriction of $T$ to $\mathcal{N}(\lambda I - T)$ is $\lambda I$. The restriction of a compact operator is a compact operator; therefore, $\lambda I$ is compact. It follows from Theorem 5.10.7 (or Exercise 11, Section 5.24) that $\mathcal{N}(\lambda I - T)$ is finite dimensional. ∎


6.10.2 THEOREM. Let $T$ be a compact linear transformation of a Hilbert space $H$ into itself and let $\lambda \ne 0$. Then $\lambda$ is either an eigenvalue of $T$ or $\lambda$ is in the resolvent set $\rho(T)$. [That is, $\lambda \ne 0$ is never in the continuous spectrum $C\sigma(T)$ or the residual spectrum $R\sigma(T)$.]


Proof⁵: Choose $\lambda$ with $\lambda \ne 0$. Suppose $\lambda \in \sigma(T)$. First let us show that $\lambda$ cannot be in the continuous spectrum. We shall do this by assuming that $(\lambda I - T)$ is one-to-one and then show that $\lambda I - T$ is bounded below, that is, there exists a constant $m > 0$ such that $\|(\lambda I - T)x\| \ge m\|x\|$ for all $x$. [This shows that any time $(\lambda I - T)$ has an inverse, this inverse is continuous, and the scalar $\lambda$ cannot be in the continuous spectrum.]

⁵ This proof is long and technical and the reader may wish to skip it on his first reading.

We argue by contradiction. Suppose no such $m$ exists. Then there is a sequence of unit vectors $\{x_n\}$ such that $\|\lambda x_n - Tx_n\| \to 0$ as $n \to \infty$. Since $T$ is compact, $\{Tx_n\}$ contains a convergent subsequence, which we shall also denote by $\{Tx_n\}$. Let $z = \lim_{n \to \infty} Tx_n$.

Since

$$z - \lambda x_n = (z - Tx_n) + (Tx_n - \lambda x_n)$$

we have

$$\|z - \lambda x_n\| \le \|z - Tx_n\| + \|Tx_n - \lambda x_n\|.$$

But both sequences on the right converge to zero. Hence

$$z = \lim_{n \to \infty} \lambda x_n,$$

or, using the fact that $\lambda \ne 0$, one has

$$\frac{1}{\lambda} z = \lim_{n \to \infty} x_n.$$

Since $\|x_n\| = 1$ we have $\|z\| = |\lambda|$, thus $z \ne 0$. Since $T$ is continuous one has

$$T\Big(\frac{1}{\lambda} z\Big) = \lim_{n \to \infty} T(x_n) = z.$$

In other words, $z$ is an eigenvector of $T$. But this is a contradiction, for we have assumed $(\lambda I - T)$ is one-to-one. Hence we have shown that there does exist an $m > 0$ such that $\|(\lambda I - T)x\| \ge m\|x\|$ for all $x$, and $(\lambda I - T)^{-1}$ must be continuous. [Note: $\lambda \ne 0$ was important.]

Next let us show that $\lambda$ is not in the residual spectrum of $T$. Recall that $\lambda$ is in the residual spectrum of $T$ if $(\lambda I - T)$ is one-to-one and the range of $(\lambda I - T)$ is not dense in $H$. We will again argue by contradiction. We will suppose that $(\lambda I - T)$ is one-to-one and $\overline{\mathcal{R}(\lambda I - T)} \ne H$. Let $X_0 = H$, $X_1 = (\lambda I - T)X_0$, $X_2 = (\lambda I - T)X_1$, and $X_{n+1} = (\lambda I - T)X_n$. It can be seen that $X_0 \supset X_1 \supset X_2 \supset X_3 \supset \cdots$. The rest of this proof depends on the fact that $X_1 \ne X_0$ implies that $X_{n+1}$ is a proper closed linear subspace of $X_n$ for all $n$. For the moment let us assume that this has been shown. It follows, then, that there is an $x_0 \in X_0$ such that $\|x_0\| = 1$ and $x_0 \perp X_1$, by Corollary 5.14.5. Furthermore, there is an $x_1 \in X_1$ such that $\|x_1\| = 1$ and $x_1 \perp X_2$. In fact, there is an $x_n \in X_n$ such that $\|x_n\| = 1$ and $x_n \perp X_{n+1}$ for all $n$. It can be seen that $\{x_n\}$ is an orthonormal sequence. Let $n > m$, then

$$\frac{1}{\lambda}(Tx_m - Tx_n) = x_m + \Big\{-x_n - \frac{1}{\lambda}(\lambda I - T)x_m + \frac{1}{\lambda}(\lambda I - T)x_n\Big\}.$$




But the term

$$\Big\{-x_n - \frac{1}{\lambda}(\lambda I - T)x_m + \frac{1}{\lambda}(\lambda I - T)x_n\Big\}$$

is a point in $X_{m+1}$, call it $-x$; therefore,

$$\frac{1}{\lambda}(Tx_m - Tx_n) = x_m - x.$$

Since $\|x_m\| = 1$ and $x_m \perp X_{m+1}$, one has

$$\|Tx_m - Tx_n\| \ge |\lambda|,$$

which shows that the sequence $\{Tx_n\}$ cannot contain a convergent subsequence. This contradicts the assumption that $T$ is compact. Hence, $\overline{\mathcal{R}(\lambda I - T)} = H$ and $\lambda$ is not in the residual spectrum of $T$.

We are not finished yet with the proof. We still have to show $X_1 \ne X_0$ implies that $X_{n+1}$ is a proper closed linear subspace of $X_n$ for all $n$. First, let us show that $\mathcal{R}(\lambda I - T)$ is closed for all $\lambda \ne 0$.


6.10.3 LEMMA. The range of $(\lambda I - T)$ is a closed linear subspace of $H$ for all $\lambda \ne 0$.


Proof: Let $\{y_n\}$ be any convergent sequence in $\mathcal{R}(\lambda I - T)$, and let $y_0 = \lim_{n \to \infty} y_n$. We want to show that $y_0 \in \mathcal{R}(\lambda I - T)$. Since $\{y_n\} \subset \mathcal{R}(\lambda I - T)$, there is at least one sequence $\{x_n\}$ in $H$ such that $(\lambda I - T)x_n = y_n$ for all $n$. Let us show that the sequence $\{x_n\}$ is bounded. Since $\lambda I - T$ is continuous, its null space $\mathcal{N}(\lambda I - T)$ is closed. Then $H = \mathcal{N}(\lambda I - T) + \mathcal{N}(\lambda I - T)^{\perp}$. With no loss in generality we can assume that $\{x_n\}$ is in $\mathcal{N}(\lambda I - T)^{\perp}$. (Why?) Now $(\lambda I - T)$ restricted to the closed subspace $\mathcal{N}(\lambda I - T)^{\perp}$ is one-to-one. So, repeating the argument used to prove the first part of Theorem 6.10.2, we know that there exists a constant $m > 0$ such that $\|(\lambda I - T)x\| \ge m\|x\|$ for all $x \in \mathcal{N}(\lambda I - T)^{\perp}$. Since $\{y_n\}$ is convergent, there is a bound $M > 0$ such that $\|y_n\| \le M$ for all $n$. Then $M \ge \|(\lambda I - T)x_n\| \ge m\|x_n\|$ or $\|x_n\| \le M/m$ for all $n$, showing that $\{x_n\}$ is bounded. Since $T$ is compact, $\{x_n\}$ contains a subsequence, which we denote by $\{x_n\}$, such that $\{Tx_n\}$ is convergent. Then

$$\lambda x_n = y_n + Tx_n. \tag{6.10.1}$$

Since $\{y_n\}$ converges to $y_0$, both sequences on the right of (6.10.1) are convergent and $\lambda \ne 0$, so $\{x_n\}$ is convergent. Let $x_0 = \lim_{n \to \infty} x_n$. Since $(\lambda I - T)$ is continuous one has

$$(\lambda I - T)\Big(\lim_{n \to \infty} x_n\Big) = \lim_{n \to \infty} (\lambda I - T)x_n$$

or

$$(\lambda I - T)x_0 = y_0.$$

Thus $y_0 \in \mathcal{R}(\lambda I - T)$ and $\mathcal{R}(\lambda I - T)$ is closed. ∎




The above lemma shows that the space $X_1$ constructed in Theorem 6.10.2 is closed. A slight variation on it shows that $X_n$ is closed for all $n$. Thus we do not have to distinguish between $X_1$ and $\overline{X_1}$.

We are now ready to finish the proof of Theorem 6.10.2. We want to show that $X_{n+1}$ is a proper closed linear subspace of $X_n$ for all $n$. We argue by induction. By our hypotheses, $X_1$ is a proper closed linear subspace of $X_0$. Assume that $X_k$ is a proper closed linear subspace of $X_{k-1}$ for $1 \le k \le n$ and we will now show that this implies that $X_{n+1}$ is a proper closed linear subspace of $X_n$. In any event, we have that $X_{n+1} \subset X_n$. So if $X_{n+1}$ is not a proper closed linear subspace of $X_n$, we have $X_{n+1} = X_n$. That is, $(\lambda I - T)X_n = X_n$. Since $(\lambda I - T)$ is one-to-one, we have $X_n = (\lambda I - T)^{-1}X_n = X_{n-1}$, which is a contradiction. Therefore, if $X_1 \ne X_0$, then $X_{n+1}$ is a proper linear subspace of $X_n$. This completes the proof of Theorem 6.10.2. ∎


The next theorem shows that $\lambda = 0$ is the only possible point of accumulation for the spectrum of a compact operator.


6.10.4 THEOREM. Let $T$ be a compact linear transformation of a Hilbert space into itself, and let $\alpha > 0$. Then the number of eigenvalues $\lambda$ with $|\lambda| > \alpha$ is finite.


Proof: We argue by contradiction. Suppose that there is an $\alpha_0 > 0$ such that the number of eigenvalues $\lambda$ with $|\lambda| > \alpha_0$ is infinite. It follows (Why?) that the spectrum of $T$ must contain at least one nonzero point of accumulation, call it $\lambda_0$. So there must be a sequence $\{\lambda_n\}$ of eigenvalues such that $\lim_{n \to \infty} \lambda_n = \lambda_0$. Let $x_n$ be an eigenvector associated with $\lambda_n$, $n = 1, 2, \ldots$. The set $\{x_1, x_2, \ldots\}$ is linearly independent (see Exercise 3, Section 5). Let $X_n$ be the finite dimensional and, therefore, closed linear subspace spanned by $\{x_1, x_2, \ldots, x_n\}$. We know from the Riesz Theorem (Theorem 5.5.4) that there is a sequence $\{y_n\}$ with $y_n \in X_n$, $\|y_n\| = 1$, and $\operatorname{dist}(y_n, X_{n-1}) \ge \tfrac{1}{2}$ $(n = 2, 3, \ldots)$. If $n > m$, then

$$\frac{1}{\lambda_n} Ty_n - \frac{1}{\lambda_m} Ty_m = y_n - \Big\{\frac{1}{\lambda_n}(\lambda_n I - T)y_n + \frac{1}{\lambda_m} Ty_m\Big\} = y_n - z,$$

where $z \in X_{n-1}$. (Why?) Therefore,

$$\Big\|\frac{1}{\lambda_n} Ty_n - \frac{1}{\lambda_m} Ty_m\Big\| \ge \operatorname{dist}(y_n, X_{n-1}) \ge \tfrac{1}{2}.$$

But we can use the above inequality together with $\lim_{n \to \infty} \lambda_n = \lambda_0 \ne 0$ to show that the sequence $\{Ty_n\}$ does not contain a convergent subsequence. This contradicts the fact that $T$ is compact. Hence, the assumption about $\alpha_0$ leads to a contradiction. ∎


The next corollary should be obvious. 


6.10.5 COROLLARY. Let $T$ be a compact operator on a Hilbert space $H$. Then the spectrum of $T$ is (at most) countably infinite and $\lambda = 0$ is the only possible point of accumulation.




As far as the point $\lambda = 0$ is concerned, we cannot say too much. If $T$ is compact, $\lambda = 0$ can be in the resolvent set or any part of the spectrum. However, if $\lambda = 0$ is in the resolvent set, $H$ must be finite dimensional.


B. Normal and Self-Adjoint Operators 


Now let us consider operators that are normal but not necessarily compact. 
Recall that every self-adjoint operator is normal; therefore, anything that is said 
about the class of normal operators applies also to self-adjoint operators. 


6.10.6 THEOREM. Let $T$ be a normal transformation of a Hilbert space $H$ into itself. If $x \in H$ is an eigenvector of $T$ associated with an eigenvalue $\lambda$, then $x$ is an eigenvector of $T^*$, the adjoint of $T$, associated with an eigenvalue $\bar{\lambda}$. Furthermore,

$$\mathcal{N}(\lambda I - T) = \mathcal{N}(\bar{\lambda} I - T^*).$$


Proof: From Theorem 5.23.10 we know that $T$ is normal if and only if $\|Tx\| = \|T^*x\|$ for all $x$. Moreover, if $T$ is normal, then $\lambda I - T$ is normal. Hence, $\|(\lambda I - T)x\| = 0$ if and only if $\|(\bar{\lambda} I - T^*)x\| = 0$. ∎


6.10.7 THEOREM. Let $T$ be a normal operator mapping a Hilbert space $H$ into itself. Then the null spaces $\mathcal{N}(\lambda I - T)$ and $\mathcal{N}(\mu I - T)$ are orthogonal to one another whenever $\lambda \ne \mu$.


Proof: Let $x \in \mathcal{N}(\lambda I - T)$ and $y \in \mathcal{N}(\mu I - T)$. We want to show that $(x, y) = 0$. By using the last theorem and the fact that $(Tx, y) = (x, T^*y)$ we get $(\lambda x, y) = (x, \bar{\mu} y)$ or $(\lambda - \mu)(x, y) = 0$. Hence $(x, y) = 0$. ∎
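A two-dimensional instance makes the orthogonality concrete. The sketch below assumes the illustrative normal matrix of a quarter-turn rotation on $C^2$, whose eigenvalues are $i$ and $-i$; it is not an example from the text. The inner product is taken linear in the first argument and conjugate-linear in the second, matching the convention used in the proofs above, and `matvec` and `inner` are hypothetical helper names.

```python
def matvec(T, x):
    """Matrix-vector product for a matrix stored as a list of rows."""
    return [sum(a * b for a, b in zip(row, x)) for row in T]


def inner(x, y):
    """Inner product on C^n, conjugate-linear in the second argument."""
    return sum(a * b.conjugate() for a, b in zip(x, y))


T = [[0, -1],
     [1, 0]]          # rotation by 90 degrees; T T* = T* T = I, so T is normal

x = [1, -1j]          # eigenvector for lambda = i
y = [1, 1j]           # eigenvector for mu = -i

Tx = matvec(T, x)
Ty = matvec(T, y)
overlap = inner(x, y)
```

Here `Tx` equals $i\,x$, `Ty` equals $-i\,y$, and `overlap` is zero, so the two eigenvectors belong to distinct eigenvalues and are orthogonal, as the theorem predicts.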


Recall (Corollary 5.22.5) that a closed linear subspace $M$ reduces a bounded linear operator $T$ if and only if $M$ is invariant under $T$ and $T^*$. We can say more when $T$ is normal.


6.10.8 THEOREM. Let $T$ be a normal transformation of a Hilbert space $H$ into itself. Then for each complex number $\lambda$ the closed linear subspace $\mathcal{N}(\lambda I - T)$ reduces $T$.


Proof: Let $M = \mathcal{N}(\lambda I - T)$. Since $(\lambda I - T)$ is continuous, there is no question about $M$ being closed. We have to show that $T(M) \subset M$ and $T(M^{\perp}) \subset M^{\perp}$. If $\lambda$ is not an eigenvalue of $T$, then $M = \{0\}$ and $M^{\perp} = H$. In this case, then, the theorem is clearly true. Assume that $\lambda$ is an eigenvalue of $T$. Since $M$ is the eigenmanifold associated with $\lambda$, we immediately have that $T(M) \subset M$. Let $x \in M$ and $y \in M^{\perp}$, then $(x, Ty) = (T^*x, y)$. Theorem 6.10.6 assures us that $T^*(M) \subset M$, hence we get $(x, Ty) = 0$ for all $x \in M$ and $y \in M^{\perp}$. Continuing further, this shows that $T(M^{\perp}) \subset M^{\perp}$. ∎


6.10.9 COROLLARY. If $\{M_n\}$ is a family of eigenmanifolds of a normal operator $T$, then $M = M_1 + M_2 + M_3 + \cdots$ reduces $T$.


Proof: From Theorem 6.10.7 we know that the $M_n$'s are pairwise orthogonal. The rest of the proof should be obvious. ∎


6.10.10 THEOREM. The residual spectrum of a normal operator is empty. 


Proof: Let $T$ be a normal operator mapping a Hilbert space $H$ into itself. We have to show that if $(\lambda I - T)$ is one-to-one, then the range $\mathcal{R}(\lambda I - T)$ is dense in $H$. Let $y$ be a point in $H$ that is orthogonal to $\mathcal{R}(\lambda I - T)$. That is,

$$(\lambda x - Tx, y) = 0 \quad \text{for all } x \text{ in } H.$$

Since $(x, \bar{\lambda} y - T^*y) = 0$ for all $x$ in $H$, it follows that $(\bar{\lambda} I - T^*)y = 0$, that is, $y \in \mathcal{N}(\bar{\lambda} I - T^*)$. It now follows from Theorem 6.10.6 that $y = 0$. Therefore, since $\mathcal{R}(\lambda I - T)^{\perp} = \{0\}$ we note that $\mathcal{R}(\lambda I - T)$ is dense in $H$, see Theorem 5.15.4(c). ∎


Needless to say, it also follows that the residual spectrum of a self-adjoint operator is empty.


6.10.11 COROLLARY. A complex number $\lambda$ is in the spectrum of a normal operator $T$ if and only if there exists a sequence $\{x_n\}$, $\|x_n\| = 1$ for all $n$, such that $\|(\lambda I - T)x_n\| \to 0$ as $n \to \infty$. In other words, the operator $(\lambda I - T)$ is not bounded below.


The proof of this corollary is left to the reader as an easy but not completely 
trivial exercise. (Also, see Exercise 1, Section 6.5.) 

As anyone familiar with the theory of Hermitian matrices would suspect, the spectrum of a self-adjoint operator is confined to the real line.


6.10.12 THEOREM. The spectrum of a self-adjoint operator $T$ is a subset of the real interval $[-\|T\|, \|T\|]$.


Proof: We can use Corollary 6.10.11. Let us show that if $\lambda$ is not real, then there exists a constant $m > 0$ such that $\|(\lambda I - T)x\| \ge m\|x\|$ for all $x$. It will follow from Corollary 6.10.11 that $\lambda$ is in the resolvent set of $T$.

Assume that $\lambda = \rho + i\sigma$, where $\sigma \ne 0$. Then a simple calculation gives

$$\begin{aligned}
\|(\lambda I - T)x\|^2 &= (\lambda x - Tx, \lambda x - Tx) \\
&= (\rho x - Tx, \rho x - Tx) + (i\sigma x, i\sigma x) \\
&\ge |\sigma|^2 \|x\|^2.
\end{aligned}$$

Hence $\lambda I - T$ is bounded below and $\lambda$ is in the resolvent set $\rho(T)$. Therefore the spectrum of $T$ is real. It follows now from Theorem 6.7.4 that $\sigma(T)$ lies in the interval $[-\|T\|, \|T\|]$. ∎


C. Compact Self-Adjoint Operators


We turn now to a statement about the existence of eigenvalues for compact self-adjoint operators. Before giving this, though, let us recall that the norm of a self-adjoint operator $T$ is given by

$$\|T\| = \sup\{|(Tx, x)| : \|x\| = 1\}. \tag{6.10.2}$$

(See Theorem 5.23.8.)




6.10.13 THEOREM. Let $T$ be a compact, self-adjoint operator on a nontrivial Hilbert space $H$. Then $T$ has an eigenvalue $\lambda$ with $|\lambda| = \|T\|$.


Proof: It follows from (6.10.2) that there is a sequence $\{x_n\}$ in $H$ with $\|x_n\| = 1$ and $|(Tx_n, x_n)| \to \|T\|$. Since $T$ is compact we can find a subsequence of $\{Tx_n\}$ that converges in $H$; furthermore, since the sequence of complex numbers $\{(Tx_n, x_n)\}$ lies in a closed bounded set, we can find a subsequence of $\{(Tx_n, x_n)\}$ that converges in the complex plane. By calling this subsequence $\{x_n\}$, one then has

$$(Tx_n, x_n) \to \lambda \quad \text{and} \quad Tx_n \to x,$$

where $|\lambda| = \|T\|$ and $x \in H$.

If $\|T\| = 0$, the conclusion of the theorem is trivial. Assume now that $T \ne 0$, which implies that $\lambda \ne 0$. One then has

$$\begin{aligned}
0 \le \|Tx_n - \lambda x_n\|^2 &= \|Tx_n\|^2 + \|\lambda x_n\|^2 - \lambda(Tx_n, x_n) - \bar{\lambda}(Tx_n, x_n) \\
&\le (\|T\|^2 + |\lambda|^2)\|x_n\|^2 - \lambda(Tx_n, x_n) - \bar{\lambda}(Tx_n, x_n) \\
&= 2|\lambda|^2 - \lambda(Tx_n, x_n) - \bar{\lambda}(Tx_n, x_n).
\end{aligned}$$

Since the right side tends to 0 as $n \to \infty$, we see that

$$Tx_n - \lambda x_n \to 0.$$

Hence $\lambda x_n \to x$, or $x_n \to (1/\lambda)x$. Hence $\|x\| = |\lambda| \ne 0$. Also

$$T\Big(\frac{1}{\lambda} x\Big) = T(\lim x_n) = \lim Tx_n = x,$$

or $Tx = \lambda x$. ∎


6.10.14 COROLLARY. Let T be a compact, self-adjoint operator on a Hilbert 
space H. If T has no eigenvalues, then H = {0}. 


D. Compact Normal Operators


We have just seen that a compact self-adjoint operator on a nontrivial Hilbert 
space has at least one eigenvalue. Our object here is to show that the same conclu- 
sion is valid for compact normal operators. 

Let $T$ be a normal operator on a Hilbert space $H$. We know then (by Exercise 13, Section 5.23) that there are commuting self-adjoint operators $A$ and $B$ such that

$$T = A + iB \quad \text{and} \quad T^* = A - iB. \tag{6.10.3}$$

Furthermore, one has (Exercise 14, Section 5.23)

$$\max(\|A\|, \|B\|) \le \|T\| = \|T^*\| \quad \text{and} \quad \|T\|^2 \le \|A\|^2 + \|B\|^2.$$

We can use the Cartesian decomposition of $T$ in (6.10.3) to determine whether $T$ is compact.


6.10.15 Lemma. Let T be a normal operator on a Hilbert space H and let 
T = A+iB be the Cartesian decomposition of T. Then T is compact if and only if 
both A and B are compact. Furthermore, T is compact if and only if T* is compact. 




Proof: First we note that

$$\|Tx\|^2 = \|Ax\|^2 + \|Bx\|^2$$

for all $x$ in $H$. Indeed, since $AB = BA$ one has

$$\begin{aligned}
\|Tx\|^2 &= ((A + iB)x, (A + iB)x) \\
&= (Ax, Ax) + (Ax, iBx) + (iBx, Ax) + (iBx, iBx) \\
&= \|Ax\|^2 - i(Ax, Bx) + i(Bx, Ax) + \|Bx\|^2 \\
&= \|Ax\|^2 - i(BAx, x) + i(ABx, x) + \|Bx\|^2 \\
&= \|Ax\|^2 + \|Bx\|^2.
\end{aligned}$$

It follows, then, that a sequence $\{Tx_n\}$ is a Cauchy sequence if and only if both the sequences $\{Ax_n\}$ and $\{Bx_n\}$ are Cauchy sequences. (Why?) Hence $T$ is compact if and only if both $A$ and $B$ are compact.

Since $T^* = A - iB$, it follows from the above that $T$ is compact if and only if $T^*$ is compact. ∎
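The identity $\|Tx\|^2 = \|Ax\|^2 + \|Bx\|^2$ at the start of this proof can be checked on a small normal matrix. The sketch assumes the illustrative $2 \times 2$ matrix $T = I + J$, with $J$ a quarter-turn rotation, and forms $A = \tfrac{1}{2}(T + T^*)$ and $B = \tfrac{1}{2i}(T - T^*)$; the helper names `adjoint`, `combine`, `matvec`, and `norm_sq` are hypothetical.

```python
def adjoint(T):
    """Conjugate transpose of a square matrix stored as a list of rows."""
    n = len(T)
    return [[T[j][i].conjugate() for j in range(n)] for i in range(n)]


def combine(T, S, a, b):
    """Entrywise linear combination a*T + b*S."""
    return [[a * t + b * s for t, s in zip(rt, rs)] for rt, rs in zip(T, S)]


def matvec(T, x):
    return [sum(a * b for a, b in zip(row, x)) for row in T]


def norm_sq(x):
    return sum(abs(c) ** 2 for c in x)


T = [[1, -1],
     [1, 1]]                              # normal: T T* = T* T = 2I
Ts = adjoint(T)

A = combine(T, Ts, 0.5, 0.5)              # (T + T*)/2
B = combine(T, Ts, -0.5j, 0.5j)           # (T - T*)/(2i), using 1/(2i) = -i/2

x = [1 + 2j, -3]
lhs = norm_sq(matvec(T, x))
rhs = norm_sq(matvec(A, x)) + norm_sq(matvec(B, x))
```

For this choice both sides come to 28 up to rounding, and `B` equals its own adjoint, consistent with $A$ and $B$ being self-adjoint in the Cartesian decomposition.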


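Let us now study the relationships between the eigenvalues of $A$ and $B$.

Let $\lambda = \alpha + i\beta$ be an eigenvalue for $T$. We recall (Theorem 6.10.6) that $\bar{\lambda}$ is then an eigenvalue of $T^*$. In addition, one can show that $\alpha$ and $\beta$ are eigenvalues of $A$ and $B$, respectively. Indeed, if $x$ satisfies $Tx = \lambda x$, then $T^*x = \bar{\lambda} x$ and

$$Ax = \tfrac{1}{2}(T + T^*)x = \tfrac{1}{2}(\lambda + \bar{\lambda})x = \alpha x,$$

$$Bx = \frac{1}{2i}(T - T^*)x = \frac{1}{2i}(\lambda - \bar{\lambda})x = \beta x.$$

This also shows that

$$\mathcal{N}(\lambda I - T) = \mathcal{N}(\bar{\lambda} I - T^*) \subseteq \mathcal{N}(\alpha I - A),$$

and

$$\mathcal{N}(\lambda I - T) = \mathcal{N}(\bar{\lambda} I - T^*) \subseteq \mathcal{N}(\beta I - B).$$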


In order to get further information concerning the eigenvalues of $T$ we have to study the relationship between the eigenspaces $\mathcal{N}(\alpha I - A)$ and $\mathcal{N}(\beta I - B)$. For this purpose let us now assume that $T$ is compact and normal and that $\alpha$ is a nonzero eigenvalue of $A$. Let $x \in \mathcal{N}(\alpha I - A)$. Then

$$(\alpha I - A)Bx = B(\alpha I - A)x = 0,$$

which shows that $B$ maps $\mathcal{N}(\alpha I - A)$ into itself. That is,

$$B : \mathcal{N}(\alpha I - A) \to \mathcal{N}(\alpha I - A)$$

and $B$ is a compact self-adjoint operator on this subspace. Furthermore, it follows from Theorem 6.10.1 that $\mathcal{N}(\alpha I - A)$ is finite dimensional. Therefore, we can find an orthonormal basis of eigenvectors $\{e_1, e_2, \ldots, e_n\}$ of $B$ in $\mathcal{N}(\alpha I - A)$ such that the mapping $B$ can be represented by a diagonal matrix in terms of this basis



(Theorem 6.4.4). Let $\{\beta_1, \beta_2, \ldots, \beta_n\}$ be the entries in this diagonal matrix. Let us now show that the complex numbers

$$\alpha + i\beta_1, \ \alpha + i\beta_2, \ \ldots, \ \alpha + i\beta_n$$

are eigenvalues for $T$. Indeed, let $e_j$ be a nonzero vector in $\mathcal{N}(\alpha I - A)$ that satisfies $Be_j = \beta_j e_j$. Hence

$$Te_j = Ae_j + iBe_j = \alpha e_j + i\beta_j e_j = (\alpha + i\beta_j)e_j,$$

for $j = 1, 2, \ldots, n$.

If we had started instead with a nonzero eigenvalue $\beta$ of $B$, then one can show that $A$ is a compact self-adjoint operator that maps $\mathcal{N}(\beta I - B)$ into itself. One can then construct an orthonormal basis for $\mathcal{N}(\beta I - B)$ so that the restriction of $A$ to this subspace can be expressed as a diagonal matrix

$$\operatorname{diag}(\alpha_1, \alpha_2, \ldots, \alpha_m).$$

By the same reasoning used above one can then show that the complex numbers

$$\alpha_1 + i\beta, \ \alpha_2 + i\beta, \ \ldots, \ \alpha_m + i\beta$$

are eigenvalues for $T$.

It is easy, then, to see that if $\alpha$ is a nonzero eigenvalue of $A$, then for some $\beta$

$$\mathcal{N}(\lambda I - T) \text{ is nontrivial, where } \lambda = \alpha + i\beta. \tag{6.10.4}$$

Moreover, one has

$$\mathcal{N}(\lambda I - T) = \mathcal{N}(\alpha I - A) \cap \mathcal{N}(\beta I - B). \tag{6.10.5}$$

Similarly if we start with a nonzero eigenvalue $\beta$ for $B$, then for some $\alpha$ (6.10.4) and (6.10.5) are valid. Moreover, (6.10.4) and (6.10.5) are valid for every eigenvalue of $T$.


6.10.16 THEOREM. Let T be a compact normal operator on a nontrivial Hilbert
space H. Then T has an eigenvalue $\lambda$ with
$$\max(\|A\|, \|B\|) \leq |\lambda|,$$

where T = A + iB is the Cartesian decomposition of T, see Figure 6.10.1.

Proof: If T = 0, then A = B = 0 and $\lambda = 0$ is an eigenvalue satisfying the
conclusion of the theorem.
Now assume that $T \neq 0$ and say that

$$\|A\| = \max(\|A\|, \|B\|) > 0.$$

Theorem 6.10.13 assures us that there is an eigenvalue $\alpha$ for A with the property that
$|\alpha| = \|A\|$. The above discussion leads to the conclusion that there is a $\beta$ such that
$\lambda = \alpha + i\beta$ is an eigenvalue of T. Finally, we note that $|\lambda| \geq |\alpha| = \max(\|A\|, \|B\|)$. ∎
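As an illustration of Theorem 6.10.16, here is a small worked example. The matrix T below is an assumption chosen for this sketch (it is not taken from the text); it is normal because it is a multiple of the identity plus a real skew-symmetric matrix.

```latex
T = \begin{pmatrix} 1+i & 2 \\ -2 & 1+i \end{pmatrix}, \qquad
A = \tfrac{1}{2}(T + T^*) = \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix}, \qquad
B = \tfrac{1}{2i}(T - T^*) = \begin{pmatrix} 1 & -2i \\ 2i & 1 \end{pmatrix}.

% Eigenvalues of A: \alpha = 1 (twice), so \|A\| = 1.
% Eigenvalues of B: (1-\beta)^2 - 4 = 0, so \beta = 3 or \beta = -1, and \|B\| = 3.
% Eigenvalues of T: (1+i-\lambda)^2 + 4 = 0, so \lambda = 1 + 3i or \lambda = 1 - i,
% which are exactly the numbers \alpha + i\beta.

\max(\|A\|, \|B\|) = 3 \;\leq\; |1 + 3i| = \sqrt{10}.
```

In this particular example the eigenvalue of maximum modulus also satisfies $|\lambda|^2 = \|A\|^2 + \|B\|^2 = 10$, which need not happen in general.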


458 ANALYSIS OF LINEAR OPERATORS (COMPACT CASE) 


[Figure 6.10.1. Eigenvalues of T = A + iB, T is Compact and Normal. The figure shows circles of radius $\|A\|$, $\|B\|$, and $(\|A\|^2 + \|B\|^2)^{1/2}$, marks the eigenvalue with maximum modulus, and uses O for eigenvalues of A, X for eigenvalues of iB, and ● for eigenvalues of T = A + iB.]


6.10.17 COROLLARY. Let T be a compact normal operator on a Hilbert space 


H. If T has no eigenvalues then H = {0}. 


Theorem 6.10.16, then, assures us of the existence of at least one eigenvalue for
T. If $\lambda_0$ is the eigenvalue of T with maximum modulus, that is,

$$|\lambda_0| = \max\{|\lambda| : \lambda \text{ is an eigenvalue for } T\},$$

then one can show that $|\lambda_0| = \|T\|$. (See Exercise 3, Section 11.)


And now for the Spectral Theorem. 


EXERCISES 


1. Prove Corollary 6.10.11. 
2. Show that the spectrum of a unitary operator U lies on the unit circle $\{z : |z| = 1\}$.

3. (a) Construct a compact normal operator T = A + iB with the property that
there is at least one eigenvalue $\lambda$ that satisfies $|\lambda|^2 = \|A\|^2 + \|B\|^2$.

(b) Construct a compact normal operator T = A + iB with the property that
every eigenvalue $\lambda$ satisfies $|\lambda|^2 < \|A\|^2 + \|B\|^2$. For your example, compute
the spectral radius $r_\sigma(T)$ and the norm $\|T\|$.

(c) Construct a compact normal (nonself-adjoint) operator T = A + iB with
the property that every eigenvalue $\lambda$ satisfies $|\lambda| \leq \max(\|A\|, \|B\|)$.

4. Let A be an observable (that is, a bounded self-adjoint operator) and let $\rho$ be
a state with associated density operator W. Assume that the spectrum of A lies
in the interval [a,b]. Show that E(A), the expected value of A with respect to the
state $\rho$, satisfies

$$a \leq E(A) \leq b.$$


6.11. THE SPECTRAL THEOREM 459 


11. THE SPECTRAL THEOREM 


The purpose of this section is to prove the following result: 


6.11.1 THEOREM. (SPECTRAL THEOREM. FIRST VERSION.) Let T be a compact
normal operator on a Hilbert space H. Then there is a resolution of the identity $\{P_n\}$
and a sequence of complex numbers $\{\lambda_n\}$ such that

$$T = \sum_n \lambda_n P_n, \qquad (6.11.1)$$

where the convergence in (6.11.1) is in terms of the uniform operator norm topology.
The expression (6.11.1) is sometimes called the spectral decomposition of T.


Proof: Let $\{\lambda_1, \lambda_2, \ldots\}$ denote the collection of all eigenvalues of T. This
collection is at most countable by Corollary 6.10.5. Let $P_n$ be the orthogonal projection
onto $M_n = \mathcal{N}(\lambda_n I - T)$. Since $M_n \perp M_m$ for $n \neq m$, it follows that $P_n P_m = 0$
for $n \neq m$. Let

$$Q = \sum_n P_n.$$
Then Q is the orthogonal projection onto
$$M = M_1 + M_2 + \cdots.$$

We want to show that Q = I, or equivalently that $M^\perp = \{0\}$. It follows from Corollary
6.10.9 that $T(M^\perp) \subset M^\perp$. Let S denote the restriction of T to $M^\perp$, that is,
$S: M^\perp \to M^\perp$. Then S is compact and normal, and any eigenvalue of S is an eigenvalue
of T. However, S has no eigenvalues, since an eigenvector of S would be an eigenvector of T
lying in $M^\perp$, whereas every eigenvector of T lies in M. Therefore, it follows from Corollary
6.10.17 that $M^\perp = \{0\}$.

We have shown that $\{P_n\}$ is a resolution of the identity. Let us now show that
$T = \sum_n \lambda_n P_n$. For this it will be convenient to order the eigenvalues so that
$|\lambda_1| \geq |\lambda_2| \geq \cdots$. Let

$$S_N = \sum_{n=1}^{N} \lambda_n P_n.$$

Since $\{P_n\}$ is a resolution of the identity, one has
$$H = M_1 + M_2 + \cdots.$$

As a consequence of the Orthogonal Structure Theorem, every vector $x \in H$ can be
written uniquely as
$$x = x_1 + x_2 + \cdots = \sum_n x_n,$$

where $x_n \in M_n$ and $\|x\|^2 = \sum_n \|x_n\|^2$. It follows that
$$Tx = \lambda_1 x_1 + \lambda_2 x_2 + \cdots = \sum_n \lambda_n x_n,$$

and

$$(T - S_N)x = \sum_{n=N+1}^{\infty} \lambda_n x_n.$$

Hence

$$\|(T - S_N)x\|^2 = \sum_{n=N+1}^{\infty} |\lambda_n|^2 \|x_n\|^2 \leq |\lambda_{N+1}|^2 \sum_{n=N+1}^{\infty} \|x_n\|^2 \leq |\lambda_{N+1}|^2 \|x\|^2.$$

Therefore $\|T - S_N\| \leq |\lambda_{N+1}| \to 0$ by Theorem 6.10.4. ∎
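The estimate $\|(T - S_N)x\| \leq |\lambda_{N+1}|\,\|x\|$ from the proof can be observed on a finite-dimensional model. The sketch below, in plain Python, assumes a hypothetical operator on a three-dimensional space built from rank-one projections onto an orthonormal basis; the eigenvalues, basis, and test vector are illustrative choices, not data from the text.

```python
import math

def projection(e):
    """Matrix of the orthogonal projection P x = (x, e) e for a unit vector e."""
    return [[ei * complex(ej).conjugate() for ej in e] for ei in e]

def weighted_sum(lams, projs, n):
    """Matrix of sum_k lams[k] * projs[k] on an n-dimensional space."""
    return [[sum(lam * P[i][j] for lam, P in zip(lams, projs))
             for j in range(n)] for i in range(n)]

def matvec(M, x):
    return [sum(row[j] * x[j] for j in range(len(x))) for row in M]

def norm(x):
    return math.sqrt(sum(abs(c) ** 2 for c in x))

r = 1 / math.sqrt(2)
basis = [[r, r, 0.0], [r, -r, 0.0], [0.0, 0.0, 1.0]]   # assumed orthonormal eigenvectors
lams = [3.0, 2j, 0.5]                                  # ordered so |lam_1| >= |lam_2| >= |lam_3|
projs = [projection(e) for e in basis]
T = weighted_sum(lams, projs, 3)
x = [1.0, -2.0, 0.5]

checks = []
for N in range(len(lams)):
    S_N = weighted_sum(lams[:N], projs[:N], 3)         # partial sum with N terms
    diff = [t - s for t, s in zip(matvec(T, x), matvec(S_N, x))]
    checks.append((norm(diff), abs(lams[N]) * norm(x)))  # (error, |lam_{N+1}| ||x||)

for err, bound in checks:
    print(err <= bound + 1e-12)
```

Each printed line compares the truncation error for one value of N with the bound from the proof.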


Actually some other versions of the Spectral Theorem are more practical in 
applications. Probably the most useful is the eigenvalue-eigenvector representation. 


6.11.2 THEOREM. (SPECTRAL THEOREM. SECOND VERSION.) Let T be a compact
normal operator on a Hilbert space H. Then there exists an (orthonormal) basis of
eigenvectors $\{e_n\}$ and corresponding eigenvalues $\{\mu_n\}$ such that if $x = \sum_n (x,e_n)e_n$ is
the Fourier expansion for x, then

$$Tx = \sum_n \mu_n (x,e_n)e_n. \qquad (6.11.2)$$


Proof: We use the notation of the last theorem. With $M_n = \mathcal{N}(\lambda_n I - T)$, let
$\{f_k^{(n)}\}$ be an orthonormal basis⁶ for $M_n$. Then

$$T f_k^{(n)} = \lambda_n f_k^{(n)}. \qquad (6.11.3)$$

Let $\{f\}$ denote the union of all these $\{f_k^{(n)}\}$. Renumber the collection $\{f\}$ to get the
family $\{e_1, e_2, \ldots\}$ and let $\{\mu_1, \mu_2, \ldots\}$ be the corresponding eigenvalues given by
(6.11.3). The only thing we have to prove is that the family $\{f\}$, or $\{e_1, e_2, \ldots\}$, is a
basis, that is, a maximal orthonormal set, in H.

First this family is orthonormal. That is, if n is fixed, then $f_i^{(n)} \perp f_j^{(n)}$ for $i \neq j$,
by construction. Also, if $n \neq m$, then $f_i^{(n)} \perp f_j^{(m)}$ for any i and j, since $M_n \perp M_m$.
Since each vector $f_k^{(n)}$ is a unit vector we see that $\{e_1, e_2, \ldots\}$ is an orthonormal set.

Next we claim that this family is maximal. Indeed if $x \perp e_n$ for all n, then
$x \perp f_k^{(n)}$ for all n and k. That is, $x \perp M_n$ for all n, or $x \perp H$. Hence x = 0.

The proof of (6.11.2) is a simple adaptation of the argument of the last
theorem. ∎


The last theorem admits another interpretation which can be viewed as the
third version of the Spectral Theorem. For this we shall assume that the Hilbert
space H is separable.⁷ This means that the mapping $U: H \to l_2$ given by

$$U: x \to ((x,e_1), (x,e_2), \ldots)$$

is a unitary mapping. The operator T is then transformed into an operator A on $l_2$
by the equation
$$T = U^{-1}AU \quad\text{or}\quad A = UTU^{-1}, \qquad (6.11.4)$$

see Figure 6.11.1. Also A is the diagonal matrix $A = \operatorname{diag}(\mu_1, \mu_2, \ldots)$. This
representation A is sometimes called the "transfer function" of T.

⁶ The subspace $M_n$ is, of course, finite dimensional when $\lambda_n \neq 0$. If $\lambda = 0$ is an eigenvalue, then the
corresponding null space $\mathcal{N}(T)$ may be infinite dimensional. In fact, if H is not separable, then
$\lambda = 0$ is necessarily an eigenvalue and $\mathcal{N}(T)$ must have an uncountable orthonormal basis.

⁷ Separability is not really necessary. It just makes things simpler.

In summary, then, every compact normal operator is a compact weighted sum 
of projections in disguise. Moreover, this fact can be used to view or represent 
compact normal operators in (at least) three ways: weighted sums of projections, 
eigenvalue-eigenvector representation, unitary equivalence to operation with a 
diagonal matrix or multiplication by a transfer function. 



Now one point must be made. The Spectral Theorem presented here is not the 
most general one possible. This should not be surprising at all, for even the weighted 
sums of projections discussed in Section 9 can be used to represent some noncompact
normal operators. In fact, if we generalized from weighted sums of projections
to "weighted integrals of projections," we would be able to represent all normal
operators. Likewise, the "transfer function" representation can be very successfully
generalized (for example, Fourier transform and z-transform methods). However, a
few mathematical difficulties arise here and there. On the other hand, the
Eigenvalue-Eigenvector representation really cannot be developed much further. All this,
however, is another story, beyond the scope of this book. The only generalization 
we will present (Section 14) concerns nonnormal compact operators. 




Figure 6.11.1. 


EXAMPLE 1. (THE RAYLEIGH-RITZ METHOD.) The Rayleigh-Ritz Method,
which we now describe, is a technique for finding the eigenvalues of a compact
normal operator T. In this example we will assume that T is actually self-adjoint and
positive, in addition to being compact. The extension of the method to arbitrary
compact self-adjoint operators, or compact normal operators, is discussed in the
exercises.

So then let $T: H \to H$ be a compact, self-adjoint, positive operator on a Hilbert
space H. Recall that the positivity means that

$$(Tx,x) \geq 0, \quad \text{for all } x \in H. \qquad (6.11.5)$$

We see then that if $\mu$ is an eigenvalue of T, then $|\mu| \leq \|T\|$ and Equation (6.11.5)
implies that $\mu \geq 0$. Now Theorem 6.10.13 tells us that

$$\mu_1 = \|T\|$$

is an eigenvalue of T. Let $e_1$ be an eigenvector of T associated with $\mu_1$ and also let
$M_1 = V(e_1)$ be the one-dimensional linear space spanned by $e_1$. Then $M_1$ reduces T,
therefore T maps $M_1^\perp$ into $M_1^\perp$. Let $T_2$ denote the restriction of T to $M_1^\perp$. $T_2$ is, of
course, compact and self-adjoint. So if we apply Theorem 6.10.13 again we see that

$$\mu_2 = \|T_2\|$$

is an eigenvalue of both $T_2$ and T. This process now continues. Let $e_2$ be an eigenvector
associated with $\mu_2$ and let $M_2 = V(e_1, e_2)$. Then T maps $M_2^\perp$ into $M_2^\perp$.
Therefore, if we let $T_3$ denote the restriction of T to $M_2^\perp$, then

$$\mu_3 = \|T_3\|$$

is another eigenvalue of T.

It can easily be seen that if we continue in this way we can then find all the 
eigenvalues of T. With these preliminaries behind us, we are now prepared to give 
the Rayleigh-Ritz Formula for the eigenvalues, which is merely successive applica- 
tions of Theorem 5.23.8, or Equation (6.10.2). 

First we note that

$$\mu_1 = \sup_{(x,x)=1} (Tx,x). \qquad (6.11.6)$$
Next we have
$$\mu_2 = \sup_{\substack{(x,x)=1 \\ (x,e_1)=0}} (Tx,x). \qquad (6.11.7)$$

Indeed, the condition $(x,e_1) = 0$ in Equation (6.11.7) is precisely the condition that
restricts T to the closed linear subspace $M_1^\perp$. Hence Equation (6.11.7) also can be
written as

$$\mu_2 = \sup\{(Tx,x) : x \in M_1^\perp,\ (x,x) = 1\} = \|T_2\|.$$
In general, if the eigenvalues $\mu_1, \mu_2, \ldots, \mu_n$ are known with corresponding eigenvectors
$e_1, e_2, \ldots, e_n$, then $\mu_{n+1}$ is given by

$$\mu_{n+1} = \sup_{\substack{(x,x)=1 \\ (x,e_1)=\cdots=(x,e_n)=0}} (Tx,x). \qquad (6.11.8)$$

This formula can easily be proved by a direct application of mathematical induction.
Before the reader becomes too enamoured with this method a somewhat subtle
limitation should be noted. Equation (6.11.8) does require that we know the eigenvectors
$e_1, \ldots, e_n$, but the Rayleigh-Ritz Method does not give any clue for determining
these eigenvectors.
It is possible to circumvent this deficiency by using certain approximation
techniques. We refer the reader to the work of Aronszajn [1] for more details. ∎
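The successive constrained suprema (6.11.6) to (6.11.8) can be imitated in finite dimensions. The sketch below is an illustration only: it assumes a hypothetical 3 x 3 symmetric positive matrix, and it uses power iteration (a technique not discussed in the text) to locate each constrained maximizer, re-imposing the constraints $(x,e_1) = \cdots = (x,e_n) = 0$ at every step.

```python
import math

def matvec(M, x):
    return [sum(M[i][j] * x[j] for j in range(len(x))) for i in range(len(M))]

def dot(x, y):
    return sum(a * b for a, b in zip(x, y))

def normalize(x):
    n = math.sqrt(dot(x, x))
    return [c / n for c in x]

def project_out(x, found):
    """Enforce the constraints (x, e_1) = ... = (x, e_n) = 0."""
    for e in found:
        c = dot(x, e)
        x = [xi - c * ei for xi, ei in zip(x, e)]
    return x

def rayleigh_ritz(T, start, count, iters=500):
    """Return approximate (mu_1, ..., mu_count) and eigenvectors for a real
    symmetric positive matrix T, each mu being (Tx, x) at a constrained maximizer."""
    mus, vecs = [], []
    for _ in range(count):
        x = normalize(project_out(start, vecs))
        for _ in range(iters):
            x = normalize(project_out(matvec(T, x), vecs))
        mus.append(dot(matvec(T, x), x))   # (Tx, x) with (x, x) = 1
        vecs.append(x)
    return mus, vecs

T = [[2.0, 1.0, 0.0], [1.0, 2.0, 0.0], [0.0, 0.0, 0.5]]   # assumed example matrix
mus, vecs = rayleigh_ritz(T, [1.0, 0.3, 0.7], 3)
print(mus)  # approximately [3.0, 1.0, 0.5]
```

Note that the sketch sidesteps the limitation mentioned above only because power iteration happens to produce the eigenvectors as well; the variational formula by itself does not.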


EXAMPLE 2. (FREDHOLM ALTERNATIVES.) Let T be a compact normal operator
on a Hilbert space H. Let y be given in H and we now seek a solution of the equation

$$x = Tx + y. \qquad (6.11.9)$$

This can be viewed as a black-box problem, as shown in Figure 6.11.2.

Figure 6.11.2.

The Fredholm alternatives tell us precisely when it is possible to solve this
problem.
(a) If 1 is not an eigenvalue of T, then there is precisely one solution x for every
y in H. The solution is of course given by

$$x = (I - T)^{-1}y.$$

(See Exercise 25 for more details.)

(b) If 1 is an eigenvalue of T, then there is a solution of (6.11.9) if and only if
$y \perp \mathcal{N}(I - T)$. In this case, if $x^*$ is any solution of (6.11.9), then every other solution
is of the form

$$x = x^* + c_1e_1 + \cdots + c_ne_n, \qquad (6.11.10)$$

where $\{e_1, \ldots, e_n\}$ is an orthonormal basis for $\mathcal{N}(I - T)$.

The first alternative (a) follows from the fact that if 1 is not an eigenvalue of T,
then 1 is in the resolvent set of T.

The second alternative (b) follows from the fact that Equation (6.11.9) has a
solution if and only if y is in the range of I - T. Since $\mathcal{R}(I - T) = \mathcal{N}(I - T)^\perp$ we
see that Equation (6.11.9) has a solution if and only if $y \perp \mathcal{N}(I - T)$.

The proof of Equation (6.11.10) is left as an exercise. ∎
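In coordinates relative to an orthonormal basis of eigenvectors of T, the two alternatives reduce to a componentwise test. The following is a minimal sketch under that assumption; the function name, the tolerance, and the sample data are hypothetical and serve only to illustrate alternatives (a) and (b) in finite dimensions.

```python
def solve_fredholm(mus, y_coeffs, tol=1e-12):
    """Solve x = Tx + y in coordinates relative to an orthonormal eigenbasis of T.

    mus[n] is the eigenvalue belonging to e_n and y_coeffs[n] = (y, e_n).
    Returns (x_star, null_indices), where x_star is one solution and
    null_indices lists the n with mu_n = 1 (free coefficients c_n),
    or None when y is not orthogonal to N(I - T).
    """
    x_star, null_indices = [], []
    for n, (mu, yn) in enumerate(zip(mus, y_coeffs)):
        if abs(1 - mu) > tol:
            x_star.append(yn / (1 - mu))   # (1 - mu_n) x_n = y_n
        elif abs(yn) > tol:
            return None                    # alternative (b) fails: no solution
        else:
            x_star.append(0.0)             # any c_n works; 0 picks one solution
            null_indices.append(n)
    return x_star, null_indices

case_a = solve_fredholm([0.5, 0.25], [1.0, 3.0])    # 1 is not an eigenvalue
case_b = solve_fredholm([1.0, 0.5], [0.0, 1.0])     # 1 is an eigenvalue, y orthogonal to N(I - T)
case_none = solve_fredholm([1.0, 0.5], [1.0, 1.0])  # 1 is an eigenvalue, y not orthogonal
print(case_a, case_b, case_none)
```

In case (b) the returned indices identify the eigenvectors $e_n$ whose multiples may be added to $x^*$, as in (6.11.10).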


EXERCISES

1. Let $K: L_2(I) \to L_2(I)$ be an integral operator y = Kx, where

$$y(t) = \int_I k(t,s)x(s)\,ds.$$

Assume that I is compact and k(t,s) is continuous. Show that K is compact.
Show that the eigenfunctions corresponding to nonzero eigenvalues can be
chosen to be continuous. What happens to eigenfunctions corresponding to the
eigenvalue $\lambda = 0$?


2. Use Mathematical Induction to prove Equation (6.11.8). 


3. Let T be a compact normal operator on a Hilbert space H.
(a) Show that there is an eigenvalue $\lambda_0$ that satisfies

$$|\lambda_0| = \max\{|\lambda| : \lambda \text{ is an eigenvalue of } T\}.$$
(b) Show that $|\lambda_0| = \|T\|$.
(c) Show that (6.10.2) holds for this T.

(d) Let H be a two-dimensional complex Hilbert space. Show that $\lambda_0$ satisfies
one of the following:

$$|\operatorname{Re}\lambda_0| = \|A\| \quad\text{or}\quad |\operatorname{Im}\lambda_0| = \|B\|.$$
What happens if H has dimension $\geq 3$?
4. Let T = A + iB be a compact normal operator. When is it true that

$$\|T\|^2 = \|A\|^2 + \|B\|^2?$$


5. Let T be a compact self-adjoint operator on a Hilbert space H and also let
$T = \sum_n \lambda_n P_n$ be the spectral decomposition of T. Then the nonzero eigenvalues
$\{\lambda_n\}$ can be partitioned into two sets $\Lambda_+$ and $\Lambda_-$, the positive and the negative
eigenvalues. The operators

$$T_+ = \sum_{\lambda_n \in \Lambda_+} \lambda_n P_n, \qquad T_- = -\sum_{\lambda_n \in \Lambda_-} \lambda_n P_n$$
are called the positive and negative parts of T.
(a) Show that $T = T_+ - T_-$.
(b) Show that $(T_+x,x) \geq 0$ and $(T_-x,x) \geq 0$ for all x in H.
(c) Show that $T_+T_- = T_-T_+ = 0$.
(d) Let $|T| = T_+ + T_-$. Show that $T \leq |T|$ and $-T \leq |T|$.

6. Let T be a compact self-adjoint operator on a Hilbert space H. 

(a) Show that the positive eigenvalues of T can be found by Equations (6.11.6), 
(6.11.7), and (6.11.8). 

(b) Show that the negative eigenvalues of T can be found by replacing "sup"
by "inf" in these three equations.


7. Use the results of Exercise 6 and Section 6.10D to discuss a method for finding 
the eigenvalues of a compact normal operator. 


8. Let L be a compact normal operator on a Hilbert space H and let

$$L = \sum_n \lambda_n P_n$$

be the decomposition of L as a weighted sum of projections. Assume that
$\lambda_n \neq 0$ for all n. Show that the polar decomposition of L is given by L = RU,
where

$$R = \sum_n |\lambda_n| P_n \quad\text{and}\quad U = \sum_n \lambda_n |\lambda_n|^{-1} P_n.$$

What happens if $\lambda = 0$ is an eigenvalue of L? Show that the Cartesian decomposition
of L is given by L = A + iB when

$$A = \sum_n \operatorname{Re}(\lambda_n) P_n \quad\text{and}\quad B = \sum_n \operatorname{Im}(\lambda_n) P_n.$$

9. Complete the proof of the Fredholm alternative (b) by verifying Equation
(6.11.10).

10. Extend the Fredholm alternatives to compact nonnormal operators by proving
the following:

(a) If 1 is not an eigenvalue of T, then there is precisely one solution x of
x - Tx = y for every y in H.

(b) If 1 is an eigenvalue of T, then x - Tx = y has a solution if and only if
$y \perp \mathcal{N}(I - T^*)$.


11. Let $k(t,s) \in L_2(I \times I)$ and define y = Kx by

$$y(t) = \int_I k(t,s)x(s)\,ds.$$

Assume that $k(t,s) = \overline{k(s,t)}$.

(a) Show that K is a compact self-adjoint operator on $L_2(I)$ and then show
that $\|K\| \leq \|k\|_2$.

(b) Let $\{e_n(t)\}$ be an orthonormal basis of eigenvectors for K with associated
eigenvalues $\{\mu_n\}$. Assume that $|\mu_1| \geq |\mu_2| \geq \cdots$. Show that

$$k(t,s) = \sum_n \mu_n e_n(t)\overline{e_n(s)}, \qquad (6.11.11)$$

where the convergence above is in $L_2(I \times I)$.
(c) Show that

$$\|k\|_2 = \left(\iint |k(t,s)|^2\,dt\,ds\right)^{1/2} = \left(\sum_n |\mu_n|^2\right)^{1/2}.$$

(d) Show that $\|K\| = |\mu_1|$.
(e) Characterize those operators K for which one has $\|K\| = \|k\|_2$.


12. (Continuation of Exercise 11.) Assume that I is closed and bounded and that
k(t,s) is continuous in t and s. In this exercise we will show that the series in
Equation (6.11.11) converges to k(t,s) uniformly in t and s provided the operator
K is positive and k(t,s) is real-valued.

(a) Show that if $\mu_n$ is a nonzero eigenvalue, then the associated eigenfunction
$e_n(t)$ can be chosen to be continuous and real-valued.

(b) Let $k_N(t,s) = \sum_{n=1}^{N} \mu_n e_n(t)e_n(s)$ and $h_N(t,s) = k(t,s) - k_N(t,s)$. Show that
$h_N(t,t) \geq 0$ for all t.

(c) Show that there is an M such that

$$k_N(t,t) \leq k(t,t) \leq M$$

for all t and all N.

(d) Show that

$$\left|\sum_{i=m}^{n} \mu_i e_i(t)e_i(s)\right|^2 \leq M \sum_{i=m}^{n} \mu_i e_i(t)^2 \to 0$$

as $m, n \to \infty$, uniformly in s for each fixed t. [That is, the convergence in
Equation (6.11.11) is uniform in each variable separately.]

(e) Show that the convergence in Equation (6.11.11) is pointwise.

(f) Show that the convergence in Equation (6.11.11) is uniform in both t and s.
[Hint: Use Dini's Theorem, from Section D.4.]


13. Let $T: H \to H$ be a compact self-adjoint operator on a Hilbert space H and let
$T = \sum_n \lambda_n P_n$ be the spectral decomposition of T. For $\lambda \in (-\infty, \infty)$ let

$$Q_\lambda = \sum_{\lambda_n \leq \lambda} P_n,$$

that is, $Q_\lambda x = \sum_{\lambda_n \leq \lambda} P_n x$ for all $x \in H$.
(a) Show that for each $\lambda$, $Q_\lambda$ is an orthogonal projection.
(b) Show that $Q_\lambda \leq Q_\mu$ if $\lambda < \mu$.
(c) Show that
$$0 = \lim_{\lambda \to -\infty} Q_\lambda, \qquad I = \lim_{\lambda \to +\infty} Q_\lambda.$$

($Q_\lambda$ is sometimes referred to as a spectral family.)


14. Let L be a self-adjoint operator on a Hilbert space H. Let $\{\phi_n\}$ be an orthonormal
collection of eigenvectors of L and let M denote the closed linear subspace
of H generated by $\{\phi_n\}$. Assume that every eigenvector of L lies in M.

(a) Show that if M = H (that is, $\{\phi_n\}$ is an orthonormal basis for H), then L
is a weighted sum of projections. Show that $\sigma(L)$ is the closure of $P\sigma(L)$.

(b) Show that if the continuous spectrum of L contains a nontrivial interval,
then $M \neq H$, that is, $\{\phi_n\}$ is not a basis for H.

15. Consider $L = S_r + S_l + \Phi$ on $l_2(0,\infty)$, where $S_r$ and $S_l$ are the right and left
shift operators, $\Phi$ is the Coulomb perturbation


] 1 
OX 3X j:535.05 08 paeae) = 26(x15%2 yen oe ae ), 


where b >0. 
(a) Show that the eigenvalues of L are 


b 27 1/2 
wefie()]  ketann 


(b) Let $\{\phi_n\}$ be the associated eigenvectors with $\|\phi_n\|_2 = 1$. Show that $\{\phi_n\}$ is not
a basis for $l_2(0,\infty)$. [Hint: Use Exercise 14 and Exercise 14 of Section 6.]


16. Let W be a density operator, that is, W is self-adjoint with $0 \leq W^2 \leq W$ and
tr W = 1.

(a) Show that W is compact.

(b) Show that one can write $W = \sum_n \lambda_n W_n$, where $W_n$ are density operators
representing pure states and $\sum_n \lambda_n = 1$.

(c) Let $e_n$ be a unit vector in $\mathcal{R}(W_n)$, and let A be an observable, that is, a
self-adjoint operator. Show that the expected value of A is $E(A) = \sum_n \lambda_n (Ae_n, e_n)$.

17. Let L and M be two compact normal operators on a Hilbert space H that
commute, that is, LM = ML. Show that there is a resolution of the identity
$\{P_n\}$ such that

$$L = \sum_n \lambda_n P_n, \qquad M = \sum_n \mu_n P_n$$

for appropriate choice of $\{\lambda_n\}$ and $\{\mu_n\}$.


18. Let $\{L_1, \ldots, L_k\}$ be a collection of compact normal operators on a Hilbert
space H that satisfy $L_iL_j = L_jL_i$ for all i, j. Show that there is a resolution of
the identity $\{P_n\}$ such that

$$L_i = \sum_n \lambda_n^{(i)} P_n, \qquad i = 1, \ldots, k,$$

where $\{\lambda_n^{(i)}\}$ depends on $L_i$.


19. Let {U} be a family of unitary operators on a Hilbert space of finite dimension
n. Assume that {U} is a commutative family, that is, if U and V belong to {U}, 
then UV = VU. Show that there is an orthonormal basis of common eigen- 
vectors. [Hint: Use Mathematical Induction on the dimension n.] 


20. (Continuation of Exercise 19.) Let $U_t$, $-\infty < t < \infty$, be a commutative family
of unitary operators on a finite-dimensional Hilbert space H that satisfy:

$$U_0 = I, \qquad U_tU_s = U_{t+s}, \qquad\text{and}\qquad U_sx \to U_tx \text{ as } s \to t$$

for every $x \in H$. Let $\{\phi_1, \ldots, \phi_n\}$ be an orthonormal basis of eigenvectors and
let $\rho_k(t)$ satisfy

$$U_t\phi_k = \rho_k(t)\phi_k, \qquad k = 1, \ldots, n.$$

(a) Show that $\rho_k(t) = \exp(i\omega_kt)$ for appropriate choice of $\omega_k$.
(b) Show that in terms of this basis $U_t$ is the matrix operator $U_t = e^{itA}$, where
$A = \operatorname{diag}\{\omega_1, \ldots, \omega_n\}$.


21. Show that the conclusions of Example 1, Section 4 can be extended to an
infinite-dimensional Hilbert space H provided one assumes that the operator
L is a compact self-adjoint strictly positive operator. What happens if one only
assumes L to be compact self-adjoint and positive?

22. What conclusion could one draw in Example 1, Section 4 if one assumes L to be
compact and normal?


23. Let A be a compact self-adjoint positive operator on a Hilbert space H and let
$\{\mu_1, \mu_2, \ldots\}$ be an enumeration of the eigenvalues of A, including multiplicity.
Show that $\operatorname{tr} A = \sum_n \mu_n$.


24. Find the eigenvectors and eigenvalues for y(t) = {% ,K(t,t)x(t) dt, where
k(t,t) = > [a, cos nt + b, sin nt], 
n=0 


where $\sum_{n=0}^{\infty} (|a_n|^2 + |b_n|^2) < \infty$.

25. Consider the equation $(\lambda I - T)x = y$, where T is a compact normal operator.
Let $\{e_n\}$ be an orthonormal basis of eigenvectors for T with corresponding
eigenvalues $\{\mu_n\}$. Assume that $\lambda \neq 0$.

(a) Show that if $\lambda$ is not an eigenvalue of T, then for every y in the Hilbert space
H there is a solution x of $(\lambda I - T)x = y$ and it is given by

$$x = \sum_{n=1}^{\infty} \frac{(y,e_n)}{\lambda - \mu_n}\,e_n.$$

(b) Show that if $\lambda$ is an eigenvalue of T, then $(\lambda I - T)x = y$ has a solution if and
only if $y \perp \mathcal{N}(\lambda I - T)$. Show that if $y \perp \mathcal{N}(\lambda I - T)$, then a solution is given
by

$$x = \sum_n \frac{(y,e_n)}{\lambda - \mu_n}\,e_n,$$

where the terms involving eigenvectors in $\mathcal{N}(\lambda I - T)$ drop out since
$(y,e_n) = 0$ for these. What is the general solution of $(\lambda I - T)x = y$ when $\lambda$
is an eigenvalue of T?


12. FUNCTIONS OF OPERATORS (OPERATIONAL CALCULUS)


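Let T be a compact normal operator on a Hilbert space H and express T as a
weighted sum of projections

$$T = \sum_n \lambda_n P_n,$$

as indicated in the Spectral Theorem. The operator $T^2$ is also a compact operator
and, furthermore, one has

$$T^2 = \sum_n \lambda_n^2 P_n.$$
To see this we note that

$$\begin{aligned}
T^2x = T(Tx) &= T\Big(\sum_n \lambda_n P_n x\Big) = \sum_m \lambda_m P_m\Big(\sum_n \lambda_n P_n x\Big) \\
&= \sum_m \sum_n \lambda_m \lambda_n P_m P_n x \\
&= \sum_n \lambda_n^2 P_n x
\end{aligned}$$

since $P_mP_n = 0$ when $m \neq n$ and $P_nP_n = P_n$.
Similarly one has

$$T^N = \sum_n \lambda_n^N P_n,$$
where N is any positive integer. In fact if
$$p(z) = \sum_{i=0}^{N} \alpha_i z^i$$
is any polynomial in z, then
$$p(T) = \sum_n p(\lambda_n) P_n,$$
where

$$p(T) = \sum_{i=0}^{N} \alpha_i T^i \quad\text{and}\quad T^0 = I.$$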


6.12. FUNCTIONS OF OPERATORS (OPERATIONAL CALCULUS) 469 


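We also know from Lemma 6.9.11 that

$$T^* = \sum_n \bar\lambda_n P_n.$$

It follows, then, that if
$$p(z,\bar z) = \sum_{i,j=0}^{N} \alpha_{ij} z^i \bar z^j$$
is a polynomial in the variables z and $\bar z$, then
$$p(T,T^*) = \sum_n p(\lambda_n, \bar\lambda_n) P_n,$$
where
$$p(T,T^*) = \sum_{i,j=0}^{N} \alpha_{ij} T^i T^{*j}.$$
We also know from Lemma 6.9.7 that the operator T is one-to-one if and only
if $\lambda_n \neq 0$ for all n. In this case $T^{-1}$ is defined on the range $\mathcal{R}(T)$ and by Lemma
6.9.10 one has

$$T^{-1}x = \sum_n \lambda_n^{-1} P_n x \qquad (x \in \mathcal{R}(T)). \qquad (6.12.1)$$
Furthermore one has
$$T^{-N}x = \sum_n \lambda_n^{-N} P_n x \qquad (x \in \mathcal{R}(T^N)),$$

where N is a positive integer. In general, if p(z) is a polynomial in z with no zeros
on the spectrum of T, then one has

$$p(T)^{-1} = \sum_n p(\lambda_n)^{-1} P_n.$$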


As a consequence of these observations one can easily prove the following 
theorem. 


6.12.1 THEOREM. Let T be a compact normal operator on a Hilbert space H
and let

$$T = \sum_n \lambda_n P_n$$

be the decomposition of T as a weighted sum of projections.
(a) If p(z) and q(z) are two polynomials in z, where q(z) has no zeros on the
spectrum $\sigma(T)$, and $r(z) = p(z) \cdot q(z)^{-1}$, then
$$p(T)q(T)^{-1} = \sum_n r(\lambda_n) P_n.$$
(b) If $p(z,\bar z)$ and $q(z,\bar z)$ are two polynomials in z and $\bar z$, where $q(z,\bar z)$ has no
zeros on $\sigma(T) \times \sigma(T^*)$, and $r = pq^{-1}$, then
$$r(T,T^*) = \sum_n r(\lambda_n, \bar\lambda_n) P_n.$$

This operational calculus can be extended to discuss continuous (even discontinuous)
functions of z. That is, if f(z) is a continuous function defined on the
spectrum $\sigma(T)$, then

$$f(T) = \sum_n f(\lambda_n) P_n.$$

The main problem here is defining f(T). The reader who is interested in pursuing
this further is referred to Dunford and Schwartz [1; Section 7.3], Simmons [1], and
Taylor [1].

There is one more point we would like to bring up here and that is the question
of the square root of a positive compact self-adjoint operator T. In this case the
eigenvalues $\lambda_n$ are all real and nonnegative, and therefore, the positive square root
$\sqrt{\lambda_n}$ is well-defined. It should be clear that in this case one has

$$T^{1/2} = \sum_n \sqrt{\lambda_n}\,P_n. \qquad (6.12.2)$$
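The rule $f(T) = \sum_n f(\lambda_n)P_n$ is easy to try out on a matrix. The sketch below assumes a hypothetical 2 x 2 positive self-adjoint operator specified by its eigenvalues and an orthonormal basis of eigenvectors; it forms a polynomial $p(T)$ and the square root $T^{1/2}$ from the spectral data and compares them with direct matrix computations. All names and numbers are illustrative.

```python
import math

def apply_function(f, lams, projs):
    """Form f(T) = sum_n f(lam_n) P_n from the spectral data of T."""
    n = len(projs[0])
    return [[sum(f(lam) * P[i][j] for lam, P in zip(lams, projs))
             for j in range(n)] for i in range(n)]

def matmul(X, Y):
    n = len(X)
    return [[sum(X[i][k] * Y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

r = 1 / math.sqrt(2)
basis = [[r, r], [r, -r]]                                  # assumed orthonormal eigenvectors (real)
projs = [[[a * b for b in e] for a in e] for e in basis]   # rank-one projections P_n
lams = [4.0, 1.0]                                          # assumed nonnegative eigenvalues

T = apply_function(lambda z: z, lams, projs)               # T = sum lam_n P_n
root = apply_function(math.sqrt, lams, projs)              # T^(1/2) as in (6.12.2)
p_of_T = apply_function(lambda z: z * z + 1, lams, projs)  # p(z) = z^2 + 1

root_squared = matmul(root, root)   # should reproduce T
T_squared = matmul(T, T)            # T^2 + I should reproduce p_of_T
print(root)  # approximately [[1.5, 0.5], [0.5, 1.5]]
```

The comparison of `root_squared` with `T` is a finite-dimensional instance of Exercise 3 below.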


EXERCISES 


1. Let A be any bounded linear operator on a Hilbert space H.
(a) Show that the series

$$e^A = I + A + \frac{A^2}{2!} + \cdots = \sum_{n=0}^{\infty} \frac{A^n}{n!}$$

converges absolutely and represents a bounded linear operator.
(b) Show that $e^A$ commutes with A.
2. Let $A = \sum_n \lambda_n P_n$ be the spectral decomposition of a compact normal operator.
Show that $e^A = \sum_n e^{\lambda_n} P_n$. What is sin A, cos A? Is it true that
$e^{iA} = \cos A + i \sin A$?
3. Prove Equation (6.12.2).

4. Let A be a bounded linear operator on a Hilbert space H and assume that $\sigma(A)$
lies in the left half of the complex plane.
(a) Show that $\|\exp At\| \to 0$ as $t \to +\infty$.
(b) Use this to show that if u is a solution of

$$\frac{du}{dt} = Au, \qquad (6.12.3)$$

then $\|u(t)\| \to 0$ as $t \to \infty$. (Show that $u(t) = (\exp At)u_0$ is a solution of
(6.12.3) that satisfies $u(0) = u_0$.)


13. APPLICATIONS OF THE SPECTRAL THEOREM 


In this section we shall present a number of applications of the Spectral 
Theorem. 


6.13. APPLICATIONS OF THE SPECTRAL THEOREM 471 


EXAMPLE 1. (MATCHED FILTER.) Suppose that we wish to select a linear filter
L so that a certain signal-to-noise ratio is maximized. In particular, let us assume
that L is to be selected from among those linear filters that can be modeled
mathematically in the form y = Lx, or

$$y(t) = \int_0^t g(\tau)x(t - \tau)\,d\tau, \qquad t \in [0,T],$$

where x is the input, y is the output, and the weighting function g is in $L_2[0,T]$. We
assume that we are given an input signal S(t) and a noise random process $N(\omega,t)$
(see Figure 6.13.1).

[Figure 6.13.1. Linear filter L and its output.]

At the final time t = T, the output is the sum of

$$s = \int_0^T g(t)S(T - t)\,dt$$

and the random variable
$$n(\omega) = \int_0^T g(t)N(\omega, T - t)\,dt.$$

Our problem is to pick g so as to maximize the signal-to-noise ratio
$$\frac{|s|^2}{E\{|n|^2\}},$$
where E denotes the mathematical expectation. Therefore, we assume that
$S \in L_2[0,T]$. Then s can be viewed as the inner product between the points g(t)
and $S_R(t) = S(T - t)$ in the Hilbert space $L_2[0,T]$. Furthermore, let us assume that
the noise $N(\omega,t)$ satisfies

$$E\left\{\int_0^T\!\!\int_0^T g(t_1)\overline{g(t_2)}\,N(\omega,T - t_1)\overline{N(\omega,T - t_2)}\,dt_1\,dt_2\right\}
= \int_0^T\!\!\int_0^T g(t_1)\overline{g(t_2)}\,E\{N(\omega,T - t_1)\overline{N(\omega,T - t_2)}\}\,dt_1\,dt_2,$$
and that the function

$$W(t_1,t_2) = E\{\overline{N(\omega,T - t_2)}\,N(\omega,T - t_1)\}$$
satisfies the condition

$$\int_0^T\!\!\int_0^T |W(t_1,t_2)|^2\,dt_1\,dt_2 < \infty.$$

We then obtain
$$E\{|n|^2\} = (Wg,g),$$
where W is the linear transformation of $L_2[0,T]$ into itself defined by h = Wg, where

$$h(t_2) = \int_0^T W(t_1,t_2)g(t_1)\,dt_1.$$

The transformation W is, of course, by assumption, known.
We then have
$$\frac{|s|^2}{E\{|n|^2\}} = \frac{|(g,S_R)|^2}{(Wg,g)}$$
and we now have the problem stated entirely in terms of the Hilbert space $L_2[0,T]$.

Moreover, W is a compact, positive self-adjoint transformation. Therefore it has a
unique positive self-adjoint square root, say that $W = A^2$. We then have

$$(Wg,g) = (A^2g,g) = (Ag,Ag) = (\phi,\phi),$$

where $\phi = Ag$. Let us now assume that there is a function $\Phi$ in $L_2[0,T]$ with the
property that $S_R = A\Phi$, that is, $S_R$ lies in the range of A. One then has

$$\frac{|s|^2}{E\{|n|^2\}} = \frac{|(g,A\Phi)|^2}{(\phi,\phi)} = \frac{|(\phi,\Phi)|^2}{(\phi,\phi)}.$$

Then using Schwarz's Inequality we have

$$\frac{|s|^2}{E\{|n|^2\}} = \frac{|(\phi,\Phi)|^2}{(\phi,\phi)} \leq (\Phi,\Phi). \qquad (6.13.1)$$

Moreover, the equality will be taken on if and only if $\phi = k\Phi$, where k is a nonzero
scalar. Thus a function g that maximizes the signal-to-noise ratio exists and is given
by $k\Phi = Ag$ or equivalently $kS_R = Wg$. We can make this more explicit by using the
Spectral Theorem.

Since W is a compact positive self-adjoint transformation, there exists an
orthonormal system $\{w_1, w_2, w_3, \ldots\}$ and a sequence of nonnegative real numbers
$\{\lambda_1^2, \lambda_2^2, \ldots\}$ with $\lambda_n^2 \to 0$ as $n \to \infty$ such that

$$Wx = \sum_n \lambda_n^2 (x,w_n)w_n.$$
Furthermore, the square root A is given by
$$Ax = \sum_n \lambda_n (x,w_n)w_n.$$
The solution of the problem $kS_R = Wg$ is then given by
$$k\sum_n (S_R,w_n)w_n = \sum_n \lambda_n^2 (g,w_n)w_n$$
or equivalently

$$g = \sum_n (g,w_n)w_n = k\sum_n \lambda_n^{-2}(S_R,w_n)w_n.$$


nm 


If any of the eigenvalues $\{\lambda_n^2\}$ are zero, then the last equation still is valid provided
the corresponding coefficient $(S_R,w_n)$ vanishes, or equivalently, provided
$S_R \perp \mathcal{N}(W)$. However, since we have assumed $S_R$ to belong to the range of A (and
ipso facto to be the range of W) we see that $S_R \perp \mathcal{N}(W)$. (Why?) ∎
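A finite-dimensional version of the matched filter makes the bound (6.13.1) concrete. The sketch below works in coordinates relative to the eigenvectors $w_n$ of W and assumes real data: hypothetical eigenvalues $\lambda_n^2$ and signal coefficients $(S_R, w_n)$. It computes $g = k\sum_n \lambda_n^{-2}(S_R,w_n)w_n$ and compares its signal-to-noise ratio with $(\Phi,\Phi)$ and with that of another filter.

```python
def snr(g, s, lam_sq):
    """|(g, S_R)|^2 / (Wg, g) in coordinates relative to the eigenvectors w_n of W."""
    signal = sum(gn * sn for gn, sn in zip(g, s)) ** 2
    noise = sum(l * gn * gn for l, gn in zip(lam_sq, g))
    return signal / noise

def matched_filter(s, lam_sq, k=1.0):
    """Coefficients (g, w_n) = k * (S_R, w_n) / lam_n^2; assumes every lam_n^2 > 0."""
    return [k * sn / l for sn, l in zip(s, lam_sq)]

lam_sq = [4.0, 1.0, 0.25]   # assumed eigenvalues lam_n^2 of W
s = [2.0, 1.0, 0.5]         # assumed coefficients (S_R, w_n)

g_opt = matched_filter(s, lam_sq)
bound = sum(sn * sn / l for sn, l in zip(s, lam_sq))   # (Phi, Phi) with Phi_n = (S_R, w_n) / lam_n
best = snr(g_opt, s, lam_sq)
other = snr([1.0, 1.0, 1.0], s, lam_sq)                # an arbitrary competing filter
print(g_opt, best, bound, other)
```

Rescaling `g_opt` by any nonzero k leaves `best` unchanged, which matches the statement that equality holds exactly when $\phi = k\Phi$.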


EXAMPLE 2. (KARHUNEN-LOEVE EXPANSION.) Let [a,b] be a finite interval.
For $t \in [a,b]$ let X(t) denote a random process with

$$E\{X(t)\} = 0, \qquad E\{|X(t)|^2\} < \infty, \qquad (6.13.2)$$
and where the covariance function
$$r(t,s) = E\{X(t)X(s)\} \qquad (6.13.3)$$

is continuous (see Example 1, Section E.6).

Let f be a complex-valued function defined on [a,b]. We shall define the random
variable $I = \int_a^b f(t)X(t)\,dt$ as follows: Let $P: a = t_0 < t_1 < \cdots < t_n = b$ be a partition
of [a,b] (see Section D.2) and let $|P| = \max_i |t_i - t_{i-1}|$. Let I(P) be the random
variable given by

$$I(P) = \sum_{i=1}^{n} f(t_i)X(t_i)(t_i - t_{i-1}).$$
If it happens that $E\{|I(P) - I|^2\} \to 0$ as $|P| \to 0$, then we shall define I as
$$I = \int_a^b f(t)X(t)\,dt.$$

In the exercises the reader is asked to show that if f(t) is continuous and if the
covariance function r(t,s) is continuous, then the integral $\int_a^b f(t)X(t)\,dt$ exists and that

$$E\left\{\int_a^b f(t)X(t)\,dt\right\} = 0. \qquad (6.13.4)$$

Furthermore, if g is also continuous, then one can show that
$$E\left\{\int_a^b f(t)X(t)\,dt \int_a^b g(s)X(s)\,ds\right\} = \int_a^b\!\!\int_a^b f(t)g(s)E\{X(t)X(s)\}\,dt\,ds
= \int_a^b\!\!\int_a^b f(t)g(s)r(t,s)\,dt\,ds \qquad (6.13.5)$$

and

$$E\left\{\int_a^b f(t)X(t)\,dt \cdot X(s)\right\} = \int_a^b f(t)r(t,s)\,dt. \qquad (6.13.6)$$


6.13.1 THEOREM. Let X(t) be a random process defined on a finite interval [a,b]
satisfying (6.13.2). Assume that the covariance function r(t,s) given by (6.13.3) is
continuous. Then one can write

$$X(t) = \sum_{n=1}^{\infty} Y_n\phi_n(t), \qquad a \leq t \leq b, \qquad (6.13.7)$$

where $\{\phi_n\}$ is an orthonormal family of eigenfunctions of the integral operator R
given by

$$y(t) = \int_a^b r(t,s)x(s)\,ds \qquad (6.13.8)$$

and moreover $\{\phi_n\}$ forms a basis for $\mathcal{N}(R)^\perp$. The random variables $Y_n$ in (6.13.7) are
given by $Y_n = \int_a^b \phi_n(t)X(t)\,dt$ and satisfy $E\{Y_n\} = 0$ and $E\{Y_nY_m\} = \delta_{nm}\lambda_m$, where
$\lambda_m$ is the eigenvalue associated with $\phi_m$. Finally the series in (6.13.7) converges in the
mean square sense to X(t), that is,

$$E\left\{\left|X(t) - \sum_{n=1}^{N} Y_n\phi_n(t)\right|^2\right\} \to 0$$

as $N \to \infty$ for all t in [a,b].


Proof: We note that the integral operator R given by (6.13.8) is compact since
r(t,s) is continuous, see Example 6, Section 5.24. Also r(t,s) = r(s,t), so R is
self-adjoint. Let $\{\phi_n\}$ be an orthonormal collection of eigenfunctions of R associated
with the nonzero eigenvalues $\{\lambda_n\}$. Then $\phi_n(t)$ is continuous and real-valued (see
Exercise 12, Section 11) and the random variable $Y_n = \int_a^b \phi_n(t)X(t)\,dt$ exists and
by (6.13.4) one has $E\{Y_n\} = 0$. Furthermore, (6.13.5) implies that

$$E\{Y_nY_m\} = \int_a^b\!\!\int_a^b \phi_n(t)\phi_m(s)r(t,s)\,ds\,dt
= \int_a^b \phi_n(t)\lambda_m\phi_m(t)\,dt = \lambda_n\delta_{nm}.$$

Next let $S_N(t) = \sum_{n=1}^{N} Y_n\phi_n(t)$. Then by a straightforward application of (6.13.5)
and (6.13.6), together with the fact that the eigenvalues of R are real, we get

$$E\{|X(t) - S_N(t)|^2\} = E\{(X(t) - S_N(t))\overline{(X(t) - S_N(t))}\}
= r(t,t) - \sum_{n=1}^{N} \lambda_n\phi_n(t)\phi_n(t).$$

It is shown in Exercise 12, Section 11 that

$$r(t,s) = \lim_{N \to \infty} \sum_{n=1}^{N} \lambda_n\phi_n(t)\phi_n(s);$$

therefore, we conclude that

$$E\{|X(t) - S_N(t)|^2\} \to 0$$
as $N \to \infty$. ∎


EXAMPLE 3. (THE KARHUNEN-LOEVE EXPANSION FOR DISCRETE RANDOM PROCESSES.)
The expansion described in the last example is also valid when the interval
[a,b] is replaced by a discrete countable set, say t = 1, 2, .... In this case, a somewhat
different notation is customarily employed.

Let $\{X_n : n = 1, 2, \ldots\}$ be a discrete random process with
$$E\{X_n\} = 0 \quad\text{and}\quad E\{|X_n|^2\} < \infty. \qquad (6.13.9)$$
Define the covariance matrix $\Gamma = (\gamma_{nm})$ by
$$\gamma_{nm} = E\{X_nX_m\}, \qquad n, m = 1, 2, \ldots,$$

and assume that
$$\sum_{n,m} |\gamma_{nm}|^2 < \infty. \qquad (6.13.10)$$

6.13.2 THEOREM. Let $\{X_n : n = 1, 2, \ldots\}$ be a discrete random process satisfying
(6.13.9) and assume that the covariance matrix $\Gamma$ satisfies (6.13.10). Then one can
write

$$X = \{X_1, X_2, \ldots\} = \sum_{k} Y_k \phi_k, \tag{6.13.11}$$

where $\phi_k = \{\phi_{k1}, \phi_{k2}, \ldots\}$ is an element of $l_2$ and the collection $\{\phi_k\}$ is an
orthonormal family of eigenvectors for the matrix operator $\Gamma$ given by $y = \Gamma x$, and,
moreover, $\{\phi_k\}$ forms a basis for $N(\Gamma)^{\perp}$. Furthermore the random variables $Y_k$ in (6.13.11)
are given by $Y_k = \sum_{n=1}^{\infty} \phi_{kn} X_n$ and satisfy $E\{Y_k\} = 0$ and $E\{Y_k Y_l\} = \delta_{kl}\lambda_k$, where
$\lambda_k$ is the eigenvalue associated with $\phi_k$. Finally, the series in (6.13.11) converges to
$X = \{X_1, X_2, \ldots\}$ in the mean-square sense, that is,

$$E\Bigl\{\Bigl|X_n - \sum_{k=1}^{K} Y_k \phi_{kn}\Bigr|^2\Bigr\} \to 0$$

as $K \to \infty$, for all $n = 1, 2, \ldots$.

The proof of this theorem, which we shall leave as an exercise, follows the
argument used in Theorem 6.13.1. The only noteworthy difference is to show that
the series

$$Y_k = \sum_{n=1}^{\infty} \phi_{kn} X_n$$

converges to a random variable $Y_k$. ∎


EXERCISES 


1. This exercise will lead to a proof that the integral $\int_a^b f(t)X(t)\,dt$ is defined when
$f$ and $X$ are continuous. We use the notation of Example 2.
(a) Let $P: a = t_0 < t_1 < \cdots < t_n = b$ and $P': a = t_0' < t_1' < \cdots < t_m' = b$ be
two partitions of $[a,b]$. Show that

$$E(I(P)I(P')) = \sum_{i}\sum_{j} f(t_i)f(t_j')\,r(t_i,t_j')\,(t_i - t_{i-1})(t_j' - t_{j-1}')
\to \int_a^b\!\!\int_a^b f(t)f(t')\,r(t,t')\,dt\,dt'$$

as $|P|, |P'| \to 0$.
(b) Show that $E(|I(P) - I(P')|^2) \to 0$ as $|P|, |P'| \to 0$.
(c) Use the completeness of $L_2(\Omega,\mathcal{F},P)$, where $(\Omega,\mathcal{F},P)$ is the underlying
probability space, to conclude that $I(P)$ has a limit in $L_2$ as $|P| \to 0$.

2. Using the notation of Example 2, show that if $E(X(t)) = 0$ for all $t$, then
$E(\int_a^b f(t)X(t)\,dt) = 0$, when $f$ and $X$ are continuous.


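3. Using the notation of Example 2, show that if $f$, $g$, and $X$ are continuous, then

$$E\Bigl(\int_a^b f(t)X(t)\,dt \int_a^b g(s)X(s)\,ds\Bigr) = \int_a^b\!\!\int_a^b f(t)g(s)\,r(t,s)\,dt\,ds$$

and

$$E\Bigl(\int_a^b f(t)X(t)\,dt \cdot X(s)\Bigr) = \int_a^b f(t)\,r(t,s)\,dt.$$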


4. Prove Theorem 6.13.2. 


5. Let $Y(t)$ be a random process defined on a finite interval $[a,b]$ with $E\{|Y(t)|^2\} < \infty$,
where $E(Y(t))$ and $E(Y(t)Y(s))$ are continuous functions. Show that

$$Y(t) = E(Y(t)) + \sum_{n} Y_n \phi_n(t),$$

where $Y_n$ and $\phi_n$ have structure similar to that defined in Theorem 6.13.1.


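6. Let $x(\omega,t)$ be a complex-valued function defined for $\omega \in [0,W]$ and $t \in [a,b]$
and satisfying

$$\int_0^W x(\omega,t)\,d\omega = 0, \qquad \int_0^W |x(\omega,t)|^2\,d\omega < \infty,$$

for all $t \in [a,b]$. Also assume that

$$K(t,s) = \int_0^W x(\omega,t)\,\overline{x(\omega,s)}\,d\omega$$

is continuous. Show that one can express $x$ in the form

$$x(\omega,t) = \sum_{n} Y_n(\omega)\phi_n(t),$$

where

$$Y_n(\omega) = \int_a^b x(\omega,t)\,\overline{\phi_n(t)}\,dt.$$

7. Use Exercises 5 and 6 to study the function $x(\omega,t) = \exp(i\omega t)$.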


14. NONNORMAL OPERATORS 


So far we have been concentrating on compact normal operators. But
suppose we have a compact operator that is not normal. What can we do? Clearly
we cannot expect to express it as a weighted sum of projections, for all weighted
sums of (orthogonal) projections are normal. Equivalently, we cannot expect the
eigenvectors to form an orthonormal set. As a matter of fact, all linear operators on
finite-dimensional spaces are compact and it is well known that even there, the
ones that are not normal can lead to difficulties. (The reader may be familiar with
the Jordan canonical form.) Not too surprisingly, things can be more difficult in
the case of infinite-dimensional spaces. For example, one may be able to show that a
nonnormal compact operator is similar to the operation of multiplication by a
(transfer) function, but it is impossible for it to be unitarily equivalent to such an
operator, for then it would be normal. In any event, we shall avoid all of these
difficulties by taking a slightly different approach. The two main advantages of this
approach are that (1) it is applicable to all compact operators, normal or not, and
(2) it involves only orthonormal sets of vectors. In fact, we shall show in this section
that every compact operator $T$ can be represented in the form

$$Tx = \sum_{n=1}^{\infty} \mu_n (x,x_n)\,y_n,$$

where the $\mu_n$'s are nonnegative real numbers and $\{x_n\}$ and $\{y_n\}$ are orthonormal sets.


6.14.1. THEOREM. Let $T$ be a compact transformation of a Hilbert space $H$ into
itself. Then there exist two orthonormal systems $\{x_n\}$ and $\{y_n\}$ and a sequence of
nonnegative real numbers $\{\mu_1, \mu_2, \mu_3, \ldots\}$ such that

$$Tx = \sum_{n=1}^{\infty} \mu_n (x,x_n)\,y_n, \tag{6.14.1}$$

where convergence is in terms of the uniform topology, that is, $\|T - S_N\| \to 0$ as $N \to \infty$,
where

$$S_N x = \sum_{n=1}^{N} \mu_n (x,x_n)\,y_n. \tag{6.14.2}$$


Proof: Whether $T$ is normal or not, the operator $T^*T$ is compact (Theorem
5.24.7) and self-adjoint. Moreover, $T^*T$ is nonnegative, that is

$$(x, T^*Tx) = (Tx, Tx) \ge 0$$

for all $x \in H$. Therefore, it follows that the eigenvalues of $T^*T$ are real and
nonnegative. Let $\{\mu_1^2, \mu_2^2, \ldots\}$ denote these eigenvalues, where $\mu_n \ge 0$ for all $n$. For
convenience we assume that $\mu_1 \ge \mu_2 \ge \mu_3 \ge \cdots$. Then, using the eigenvalue-eigenvector
representation for $T^*T$, we have

$$T^*Tx = \sum_{n=1}^{\infty} \mu_n^2 (x,x_n)\,x_n,$$

where a given eigenvalue is repeated according to its multiplicity and $\{x_n\}$ is an
orthonormal basis of eigenvectors. This operator $T^*T$ has a unique nonnegative
square root $R$ given by

$$Rx = \sum_{n=1}^{\infty} \mu_n (x,x_n)\,x_n.$$

For $\mu_n \ne 0$, let

$$y_n = \frac{1}{\mu_n}\,Tx_n. \tag{6.14.3}$$

Then

$$(y_n, y_m) = \frac{1}{\mu_n\mu_m}(Tx_n, Tx_m) = \frac{1}{\mu_n\mu_m}(x_n, T^*Tx_m) = \frac{\mu_m}{\mu_n}(x_n, x_m),$$

so $\{y_n\}$ is an orthonormal system. We now extend the class $\{y_n\}$ to an
orthonormal basis in $H$. One then has

$$Tx_n = \mu_n y_n$$

for all $\mu_n$'s, even for $\mu_n = 0$. Define $S_N$ by (6.14.2) and let us show that $\|T - S_N\| \to 0$
as $N \to \infty$.
We know that each $x \in H$ can be expressed uniquely in the form

$$x = (x,x_1)x_1 + \cdots + (x,x_N)x_N + \sum_{n=N+1}^{\infty} (x,x_n)\,x_n.$$

Then

$$Tx - S_N x = \sum_{n=N+1}^{\infty} (x,x_n)\,Tx_n = \sum_{n=N+1}^{\infty} \mu_n (x,x_n)\,y_n.$$

Since the $\{y_n\}$ form an orthonormal system, one has

$$\|Tx - S_N x\|^2 = \sum_{n=N+1}^{\infty} \mu_n^2\,|(x,x_n)|^2 \le \mu_{N+1}^2\,\|x\|^2.$$

Therefore $\|T - S_N\| \le \mu_{N+1} \to 0$, as asserted. ∎


We leave it to the reader to show that the $\mu_n^2$'s are also the eigenvalues of $TT^*$
and that the $y_n$'s are eigenvectors of $TT^*$.
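The construction in the proof of Theorem 6.14.1 can be followed by hand in two dimensions. The sketch below is an illustration only, with hypothetical function names: it takes the nonnormal Jordan block $T = \begin{pmatrix}1 & 1\\ 0 & 1\end{pmatrix}$ (an assumed example, not from the text), finds the eigenpairs $(\mu_n^2, x_n)$ of $T^*T$ in closed form, sets $y_n = Tx_n/\mu_n$, and evaluates $\sum_n \mu_n (v,x_n)\,y_n$. It assumes a real invertible matrix for which $T^*T$ has a nonzero off-diagonal entry.

```python
import math

def matvec(A, v):
    """Product of a 2x2 matrix with a 2-vector."""
    return [A[0][0] * v[0] + A[0][1] * v[1], A[1][0] * v[0] + A[1][1] * v[1]]

def dot(u, v):
    """Real inner product on R^2."""
    return u[0] * v[0] + u[1] * v[1]

def decompose(T):
    """Return [(mu_n, x_n, y_n)] for a real invertible 2x2 T (assumes b != 0 below)."""
    # Entries of the symmetric nonnegative matrix G = T^T T = [[a, b], [b, d]].
    a = T[0][0] ** 2 + T[1][0] ** 2
    b = T[0][0] * T[0][1] + T[1][0] * T[1][1]
    d = T[0][1] ** 2 + T[1][1] ** 2
    # Eigenvalues mu_n^2 of G from its characteristic polynomial, largest first.
    mean = 0.5 * (a + d)
    radius = math.sqrt(0.25 * (a - d) ** 2 + b ** 2)
    triples = []
    for mu_sq in (mean + radius, mean - radius):
        # (b, mu_sq - a) is an eigenvector of G for the eigenvalue mu_sq when b != 0.
        norm = math.hypot(b, mu_sq - a)
        x = [b / norm, (mu_sq - a) / norm]
        mu = math.sqrt(mu_sq)
        Tx = matvec(T, x)
        y = [Tx[0] / mu, Tx[1] / mu]  # y_n = T x_n / mu_n
        triples.append((mu, x, y))
    return triples

def reconstruct(triples, v):
    """Evaluate sum_n mu_n (v, x_n) y_n."""
    out = [0.0, 0.0]
    for mu, x, y in triples:
        c = mu * dot(v, x)
        out = [out[0] + c * y[0], out[1] + c * y[1]]
    return out

T = [[1.0, 1.0], [0.0, 1.0]]  # Jordan block: T T^T != T^T T, so T is not normal
triples = decompose(T)
```

Calling `reconstruct(triples, v)` should agree with `matvec(T, v)` up to rounding, even though the eigenvectors of $T$ itself do not span the plane.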


Figure 6.14.1. (Block diagram: a linear time-invariant system L and its output.)


EXAMPLE 1. This example and the next are concerned with a classical system
identification problem that arises in control engineering or systems engineering,
see Truxal [1, pp. 437-438]. In simple terms, one is given a black box and the
ability to record inputs and the corresponding outputs (see Figure 6.14.1). With
this information one seeks a method for characterizing a mathematical model for
the black box. Usually one can make some assumptions about the format of the
mathematical model, and for our purposes we assume that this is a convolution
operator on $L_2[0,T]$, that is, a linear time-invariant operator of the form

$$y(t) = \int_0^t h(\tau)\,x(t-\tau)\,d\tau, \qquad t \in [0,T], \tag{6.14.4}$$

where⁸ $h \in L_2[0,T]$ and $0 < T < \infty$. However, we do not know $h$. Our problem is
to determine $h$ by running appropriate experiments with an input $x$ and the
corresponding output $y$. We will then try to use the ordered pair $\{x,y\}$ to determine $h$,
or perhaps, to estimate $h$.

To begin with let us change our view of (6.14.4). Instead of it representing a
mapping of $x$'s into $y$'s, we view it as a mapping $X$ of $h$'s into $y$'s. Thus each input
$x$ yields a mapping $X: L_2[0,T] \to L_2[0,T]$, given by $y = Xh$. Then if we have chosen
our experiment input $x$ so that $X$ is invertible, determination of $h$ is, in principle at
least, simple; indeed, $h = X^{-1}y$.

However, an interesting problem arises. First note that for any $x$, the operator
$X$ is compact. In fact,

$$\int_0^T\!\!\int_0^T |x(t-\tau)|^2\,d\tau\,dt < \infty,$$

where $x(t-\tau) = 0$ for $\tau > t$. So not only is $X$ compact, it is also a Hilbert-Schmidt
operator. In any event, using the decomposition of Theorem 6.14.1, we have that

$$Xh = \sum_{n} \mu_n (h,x_n)\,y_n,$$

where $\mu_n \to 0$ as $n \to \infty$. (Why?) If $X$ is one-to-one, then $\mu_n \ne 0$ for all $n$ and (see
Exercise 2)

$$X^{-1}y = \sum_{n=1}^{\infty} \frac{1}{\mu_n}(y,y_n)\,x_n \tag{6.14.5}$$

for all $y$ in the range of $X$. We see then that if $X$ is one-to-one (which is an a priori
condition on the input $x$), then $h$ is given by (6.14.5), where $y$ is the corresponding
output.

Since in this problem one has $\mu_n \to 0$, it immediately follows from (6.14.5)
that $X^{-1}$ is not continuous no matter what experiment input $x$ we use. This can
be important. If for some reason (and there always is one) an error is made in
measuring $y$ and one has $y + z$ instead, then

$$\|X^{-1}(y+z) - X^{-1}(y)\|$$

can be very large even when $\|z\|$ is small. ∎
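The loss of continuity can be made concrete in coordinates. The following sketch is only an illustration under stated assumptions: it works directly with coefficients along $\{x_n\}$ and $\{y_n\}$, and it picks the hypothetical values $\mu_n = 1/n$ solely because they tend to zero. A measurement error $z = \delta\,y_N$ of norm $\delta$ then changes the recovered $h$ by $\delta/\mu_N = \delta N$.

```python
import math

def mu(n):
    """Hypothetical values mu_n = 1/n, chosen only so that mu_n -> 0."""
    return 1.0 / n

def apply_X(h_coeffs):
    """Coefficients of Xh along {y_n}, given the coefficients (h, x_n) of h."""
    return [mu(n) * c for n, c in enumerate(h_coeffs, start=1)]

def apply_X_inverse(y_coeffs):
    """Formula (6.14.5) in coordinates: coefficients of X^{-1} y along {x_n}."""
    return [c / mu(n) for n, c in enumerate(y_coeffs, start=1)]

def norm(coeffs):
    """Euclidean norm of a finite coefficient list."""
    return math.sqrt(sum(c * c for c in coeffs))

def error_amplification(N, delta):
    """Norm of X^{-1}(y + z) - X^{-1}(y) when z = delta * y_N."""
    h = [1.0 / n for n in range(1, N + 1)]  # an arbitrary test element h
    y = apply_X(h)
    z = [0.0] * N
    z[N - 1] = delta                        # error of size delta along y_N
    noisy = [a + b for a, b in zip(y, z)]
    h_noisy = apply_X_inverse(noisy)
    h_clean = apply_X_inverse(y)
    return norm([a - b for a, b in zip(h_noisy, h_clean)])

if __name__ == "__main__":
    # The same small error is amplified more and more as it moves to higher modes.
    for N in (10, 100, 1000):
        print(N, error_amplification(N, 1e-3))
```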


EXAMPLE 2. (CONTINUATION OF EXAMPLE 1.) One often proceeds with the
identification of $L$ by first performing a correlation type of operation between the
input $x$ and the output $y$; in particular, let

$$\rho(t) = \int_t^T x(\tau - t)\,y(\tau)\,d\tau, \qquad t \in [0,T]. \tag{6.14.6}$$

But (6.14.6) represents the adjoint $X^*$ of $X$, so we have $X^*y = X^*Xh$. If the input
$x$ could be chosen so that $X^*X = I$, we would have

$$h = X^*y. \tag{6.14.7}$$


⁸ Since $L_2[0,T] \subset L_1[0,T]$, we also have $h \in L_1[0,T]$. So we are assured that $L$ is a bounded operator.




However, this is impossible because $X$ is compact. Indeed

$$X^*Xh = \sum_{n} \mu_n^2 (h,x_n)\,x_n,$$

and $\mu_n^2$ cannot be unity for all $n$. On the other hand, it is often possible to choose
the input $x$ so that (6.14.7) is a suitable approximation; in particular, $x$ can be
chosen so that the restriction of $X^*X$ to an appropriate finite-dimensional subspace
containing $h$ is approximately $I$. See Truxal [1] for further details. ∎


EXAMPLE 3. ($\varepsilon$-CAPACITY OF A LINEAR CHANNEL.)⁹ Consider the linear
communication channel $L$ operating as shown in Figure 6.14.2. We assume that the
input $x$, the undisturbed output $y$, the noise $n$, and the observable output $z$ are all in
the Hilbert space $L_2(I)$, where $I = [-T/2, T/2]$, $T > 0$. Further, $L$ is a linear
transformation of $L_2(I)$ into itself. We will view $\|\cdot\|^2$ as the energy of the various signals.
The only thing we know about the noise $n$ is that $\|n\| \le \varepsilon$, where $\varepsilon > 0$. We assume
that the inputs $x$ are restricted in maximum energy. That is, the allowable set of
inputs $S_E$ is given by

$$S_E = \{x \in L_2(I) : \|x\| \le E\},$$

where $0 < E < \infty$.

Figure 6.14.2. (Block diagram: the input enters the communication channel L, which produces the undisturbed output; the noise n is then added to give the observable output.)

Suppose that we want to send signals through this system in such a way that on
the basis of an observation $z$ one can say exactly which input $x$ was used. If, for
example, one of two inputs is sent, then the two possibilities are shown in Figure
6.14.3. In (a), if a $z$ is observed that comes from the shaded region (the intersection
of the two $\varepsilon$-balls), then one cannot say whether $x_\alpha$ or $x_\beta$ was sent. Whereas, in
(b) one can say unambiguously on the basis of $z$ which input was used. In general,
given a maximum input energy $E^2$ and a noise level $\varepsilon$, one would like to choose a set
of inputs (that is, messages) $M \subset S_E$ such that $\|L(x_\alpha) - L(x_\beta)\| \ge 2\varepsilon$, $x_\alpha, x_\beta \in M$, and
$x_\alpha \ne x_\beta$. Since it is desirable to have as many inputs (messages) available as possible,
one is interested in how large $M$ can be. Since $L_2(I)$ is separable, we can immediately
say that $M$ is at most countably infinite. (Why?) However, for most practical
communication channels $L$ is compact, so we can say that $M$ is usually finite. (Why?) In
any event, if $L$ is compact and $N(\varepsilon,E)$ is the maximum number of messages that
can be unambiguous in the above sense, then

$$C(\varepsilon,E) = \log_2 N(\varepsilon,E)$$

is referred to as the $\varepsilon$-capacity of the channel.

Figure 6.14.3. (Two panels: in (a) the two $\varepsilon$-balls intersect; in (b) they are disjoint.)

⁹ This example was suggested by the work of Prosser and Root [1].
Suppose the communication channel $L$ can be modeled mathematically by the
integral operator

$$(Lx)(t) = \int_I h(t-\tau)\,x(\tau)\,d\tau, \qquad t \in I, \tag{6.14.8}$$

where the weighting or unit impulse response function $h$ is defined on $(-\infty,\infty)$ and

$$\int_{-\infty}^{\infty} |h(t)|^2\,dt < \infty, \tag{6.14.9}$$

that is, $h \in L_2(-\infty,\infty)$. It follows that (6.14.8) represents a continuous linear
transformation of $L_2(I)$ into itself. Moreover, since we are considering a finite time
interval, we have

$$\int_I\!\int_I |h(t-\tau)|^2\,d\tau\,dt < \infty, \tag{6.14.10}$$

that is, $L$ is compact.
Since $L$ is compact, we have

$$Lx = \sum_{n} \mu_n (x,x_n)\,y_n,$$

where we assume that $\mu_1 \ge \mu_2 \ge \mu_3 \ge \cdots$, and the image of the closed ball of
radius $E$ is a compact ellipsoid $\mathcal{E}$ whose semi-axes are $\mu_n E y_n$, $n = 1, 2, \ldots$. $N(\varepsilon,E)$
is, then, the maximum number of pairwise disjoint open $\varepsilon$-balls that can be fitted
into $\mathcal{E}$. Prosser and Root [1] have shown that


n(2e) /y) n(e/V¥2) 79. [9 .E 
(4) < N(g,E)< |] Ge ), 


i=1 26 i=1 é 


where $n(c)$ denotes the number of $\mu_n$'s such that $\mu_n > c$. They have also investigated
the behavior of the $\varepsilon$-capacity of (6.14.8) as $T \to \infty$ and $\varepsilon \to 0$ and given estimates of
this $\varepsilon$-capacity in terms of the Fourier transform of $h(t)$. ∎




EXERCISES 


1. How is the decomposition of T, as given in this section, related to the decomposi- 
tion in the Spectral Theorem when T is compact and normal? 


2. If $T$ is compact and one-to-one, show that the inverse of $T$ defined on its range can
be represented by

$$T^{-1}y = \sum_{n=1}^{\infty} \frac{1}{\mu_n}(y,y_n)\,x_n,$$

where the domain of $T^{-1}$ is all $y$ for which the above series converges.


3. Let $T$ be a compact (not necessarily normal) operator. Show that the
transformation $(\lambda I - T)$ can be represented by

$$(\lambda I - T)x = \sum_{n=1}^{\infty} \sigma_n (x,x_n)\,y_n, \qquad \text{for all } x \in H,$$

where $\{x_1, x_2, \ldots\}$ and $\{y_1, y_2, \ldots\}$ are orthonormal systems and the $\sigma_n$'s are
nonnegative real numbers. [Hint: Show that

$$(\bar{\lambda} I - T^*)(\lambda I - T)x = \sum_{n} (|\lambda|^2 - \mu_n)(x,x_n)\,x_n,$$

for all $x \in H$, where $|\lambda|^2 - \mu_n \ge 0$. Now use the argument of Theorem 6.14.1.]


4. Discuss Theorem 6.14.1 for the case of the Volterra integral operator $y = Kx$,
where

$$y(t) = \int_a^t k(t,\tau)\,x(\tau)\,d\tau$$

on $L_2(a,b)$ with

$$\int_a^b\!\!\int_a^b |k(t,\tau)|^2\,dt\,d\tau < \infty.$$


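5. Define $k(t,\tau)$ by

$$k(t,\tau) = \begin{cases} 2t - \tau - 1, & 0 \le \tau \le t \le 1,\\ \tau - 1, & 0 \le t \le \tau \le 1. \end{cases}$$

Use Theorem 6.14.1 to analyze the operator

$$y(t) = \int_0^1 k(t,\tau)\,x(\tau)\,d\tau.$$

[See Exercise 1(f), Section 7.2.]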




SUGGESTED REFERENCES 


Aronszajn [1] Lorch [1] 

Ash [1] Paley and Wiener [1] 
Bachman and Narici [1] Prosser and Root [1] 
Bochner [1] Riesz and Sz. Nagy [1] 
Edwards [2] Stone [1] 

Gikhman and Skorokhod [1] Truxal [1] 

Hilbert [1] Zygmund [1] 


Also see references at the end of Chapter 5. 


Analysis of Unbounded Operators

1. Introduction
2. Green's Functions
3. Symmetric Operators
4. Examples of Symmetric Operators
5. Sturm-Liouville Operators
6. Gårding's Inequality
7. Elliptic Partial Differential Operators
8. The Dirichlet Problem
9. The Heat Equation and Wave Equation
10. Self-Adjoint Operators
11. The Cayley Transform
12. Quantum Mechanics, Revisited
13. Heisenberg Uncertainty Theorem
14. The Harmonic Oscillator


1. INTRODUCTION 


In the last chapter we discussed the concept of weighted sums of projections. 
We saw that every compact normal operator can be expressed as a weighted sum 
of projections. Compact operators are, of course, bounded operators. 

In this chapter we turn our attention to unbounded operators. We are interested 
in knowing when an unbounded linear operator can be represented as a weighted 
sum of projections. You will recall (Exercise 1, Section 6.9) that if $L$ is a bounded
linear operator that can be represented as a weighted sum of projections, then there
exists an orthonormal basis of eigenvectors $\{x_1, x_2, \ldots\}$ for $L$, with corresponding
eigenvalues $\{\mu_1, \mu_2, \ldots\}$, such that

$$x = \sum_{n} (x,x_n)\,x_n$$

and

$$Lx = \sum_{n} \mu_n (x,x_n)\,x_n.$$


The same phenomenon applies to certain unbounded linear operators (see Exercise 
1 in this section). 

To repeat, we wish to know when an unbounded operator can be represented 
as a weighted sum of projections. A sufficient condition, which will handle all the 
situations we shall be interested in, is given in Theorem 7.1.1 below. However, before 
formulating this theorem, it is necessary to introduce a technical condition con- 
cerning unbounded linear operators. 

We shall say that a linear operator $L$ is an operator on a Hilbert space $H$ if both
the domain $\mathcal{D}_L$ and the range $\mathcal{R}_L$ lie in $H$ and $\mathcal{D}_L$ is dense in $H$. One sometimes says
that $L$ is densely defined. In the remainder of this chapter we shall consider only
operators that are densely defined.

As we shall see, this restriction to densely defined operators is not serious. The
following test will suffice in many cases. Assume that $H$ is the Hilbert space
$L_2(R^n)$. It is shown in Appendix D that the space of infinitely differentiable functions
with compact support $C_0^\infty(R^n)$ is dense in $L_2(R^n)$. It follows then that if $L$ is an
operator with $L: \mathcal{D}_L \to L_2(R^n)$ and $C_0^\infty(R^n) \subset \mathcal{D}_L$, then $L$ is densely defined. More
generally, if $\Omega$ is an open set in $R^n$ and $H = L_2(\Omega)$, then the space $C_0^\infty(\Omega)$ is dense
in $L_2(\Omega)$ (Exercise 4, Section D.12). Thus any linear operator $L: \mathcal{D}_L \to L_2(\Omega)$ with
$C_0^\infty(\Omega) \subset \mathcal{D}_L$ is densely defined.


7.1.1 THEOREM. Let $L$ be a linear operator on a Hilbert space $H$. If there is a
complex number $\lambda_0$ in the resolvent set of $L$ for which $(\lambda_0 I - L)^{-1}$ is compact and
normal,¹ then $L$ can be expressed as a weighted sum of projections.


¹ This is a slight abuse of terminology since the range of $(\lambda_0 I - L)$ may be only dense in $H$ and not
all of $H$. Therefore $(\lambda_0 I - L)^{-1}$ would be defined only on a dense subset of $H$. But if $(\lambda_0 I - L)^{-1}$
is compact it is also continuous. Therefore, it has a unique extension to $H$ which we also denote by
$(\lambda_0 I - L)^{-1}$. The hypotheses then ask that this extension be normal. (It is automatically compact.)




Proof: Since $(\lambda_0 I - L)^{-1}$ is compact and normal, there is a weighted sum of
projections $\sum_n \lambda_n P_n$ with

$$(\lambda_0 I - L)^{-1} = \sum_{n} \lambda_n P_n$$

by the first version of the Spectral Theorem. By taking the inverse, we get

$$\lambda_0 I - L = \sum_{n} \lambda_n^{-1} P_n$$

or

$$L = \sum_{n} (\lambda_0 - \lambda_n^{-1}) P_n,$$

where the last equality holds on $\mathcal{D}_L$. ∎


The hypothesis of the last theorem is so important that it warrants a special 
definition. 


7.1.2 DEFINITION. Let $L$ be a linear operator on a Hilbert space $H$. We shall
say that $L$ has a compact-normal resolvent if there is a $\lambda_0$ in the resolvent set of $L$
for which $(\lambda_0 I - L)^{-1}$ is compact and normal.


While the last theorem gives a sufficient² condition that an operator can be
represented as a weighted sum of projections, it is only the starting point for our 
analysis. The problem we wish to solve is the following: 

Given the operator L, how can we determine whether it has a compact-normal 
resolvent? More specifically, what conditions on L itself will guarantee that L has a 
compact-normal resolvent? 

Our problem, then, is twofold. Given a complex number $\lambda_0$ in the resolvent set
of $L$,


(1) when is $(\lambda_0 I - L)^{-1}$ normal, or self-adjoint, and
(2) when is $(\lambda_0 I - L)^{-1}$ compact?


The rest of the chapter is concerned with the study of these questions. 


EXERCISES 


1. Let $L$ be a linear operator on a Hilbert space $H$ with a compact-normal resolvent.
Show that there exists an orthonormal basis $\{x_1, x_2, \ldots\}$ in $H$ and a sequence
of complex numbers $\{\mu_1, \mu_2, \ldots\}$ such that
(a) $Lx_n = \mu_n x_n$.
(b) $Lx = \sum_n \mu_n (x,x_n)\,x_n$ for every vector $x$ in the domain of $L$.
(c) Show that the domain of $L$ is contained in the linear subspace
$$\{x \in H : \textstyle\sum_n |\mu_n (x,x_n)|^2 < \infty\}.$$
(d) Show that $L$ is bounded if and only if the sequence $\{\mu_1, \mu_2, \ldots\}$ is bounded.


² It is not a necessary condition for the simple reason that many noncompact operators can be
expressed as weighted sums of projections.




2. Let $L$ be a linear operator on a Hilbert space $H$ with a compact-normal resolvent.
(a) Show that for every $\lambda$ in the resolvent set of $L$ the operator $(\lambda I - L)^{-1}$ is
compact and normal.
(b) Show that $L$ has at most a countable number of points in its spectrum.


2. GREEN’S FUNCTIONS 


For a certain class of operators, especially certain differential operators, the
inverse operator can be represented as an integral operator. For these operators we
have available a test for a compact-normal resolvent. To be specific assume that $H$
is the Hilbert space $L_2(I)$ and

$$(\lambda_0 I - L)^{-1}u(x) = \int_I g(x,y)\,u(y)\,dy$$

for all $u$ in $H$. Then the kernel $g(x,y)$ is said to be the Green's function for $\lambda_0 I - L$,
or $(\lambda_0 I - L)^{-1}$.
Recall (Example 6, Section 5.24) that if

$$\int_I\!\int_I |g(x,y)|^2\,dx\,dy < \infty, \tag{7.2.1}$$

then $(\lambda_0 I - L)^{-1}$ is compact. In particular, if $I$ is a compact set and if $g$ is continuous
or bounded, then we see that $(\lambda_0 I - L)^{-1}$ is compact. Also recall that if

$$g(x,y) = \overline{g(y,x)}, \tag{7.2.2}$$

then $(\lambda_0 I - L)^{-1}$ is self-adjoint.³ Thus, if (7.2.1) and (7.2.2) are both satisfied, then
$L$ has a compact-normal resolvent. Let us now look at some examples.⁴


EXAMPLE 1. Consider the Hilbert space $L_2[0,1]$ and let $Lu = -u''$, where
the domain $\mathcal{D}_L$ consists of all functions $u$ in $L_2[0,1]$ such that $u'$ is absolutely
continuous, $u'' \in L_2[0,1]$, and $u(0) = u(1) = 0$. Let us show that $L^{-1}$ is an integral
operator given by

$$(L^{-1}v)(t) = \int_0^1 g(t,\tau)\,v(\tau)\,d\tau, \tag{7.2.3}$$

where

$$g(t,\tau) = \begin{cases} (1-t)\,\tau, & 0 \le \tau \le t \le 1,\\ (1-\tau)\,t, & 0 \le t \le \tau \le 1. \end{cases}$$

If we can verify that (7.2.3) holds for this $g(t,\tau)$, then it follows that $L^{-1}$ is compact,
since $g(t,\tau)$ is bounded. Also, $L^{-1}$ is self-adjoint since $g(t,\tau)$ satisfies (7.2.2).


³ We could replace (7.2.2) with the corresponding property for normality. We will not do this here
since all of our applications deal with the self-adjoint case.

⁴ In our examples we will not explain how one can derive the given Green's functions. If one is
interested in this question, an excellent discussion can be found in Courant and Hilbert [1, Vol. I,
pp. 351-388]. As we shall ultimately see (see Section 5) the actual derivation of the Green's function
is not important from our point of view. Most of the information we seek concerning linear
differential operators can be derived without the use of the Green's function.




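Now let $v \in L_2[0,1]$ and define $u$ by

$$u(t) = \int_0^1 g(t,\tau)\,v(\tau)\,d\tau = \int_0^t \tau\,v(\tau)\,d\tau + t\int_t^1 v(\tau)\,d\tau - t\int_0^1 \tau\,v(\tau)\,d\tau.$$

It is easy to see that $u(0) = u(1) = 0$. Since

$$u'(t) = \int_t^1 v(\tau)\,d\tau - \int_0^1 \tau\,v(\tau)\,d\tau \quad\text{and}\quad u''(t) = -v(t),$$

we see that $u \in \mathcal{D}_L$ and $Lu = v$. Since $L$ is one-to-one we see that $L^{-1}$ is given by
(7.2.3). ∎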
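The verification above can also be run numerically. The sketch below is a minimal check, not part of the text: it approximates the integral in (7.2.3) by the midpoint rule for the arbitrary test function $v \equiv 1$, for which $-u'' = 1$, $u(0) = u(1) = 0$ has the solution $u(t) = t(1-t)/2$. The function names are hypothetical.

```python
def g(t, tau):
    """Green's function of Example 1 for Lu = -u'' with u(0) = u(1) = 0."""
    return (1.0 - t) * tau if tau <= t else (1.0 - tau) * t

def apply_inverse(v, t, n=2000):
    """Midpoint-rule approximation of the integral of g(t, tau) v(tau) over [0, 1]."""
    h = 1.0 / n
    return sum(g(t, (k + 0.5) * h) * v((k + 0.5) * h) for k in range(n)) * h

def second_difference(u, t, h=1e-2):
    """Central-difference approximation of u''(t)."""
    return (u(t + h) - 2.0 * u(t) + u(t - h)) / (h * h)

v = lambda tau: 1.0                # test right-hand side (an arbitrary choice)
u = lambda t: apply_inverse(v, t)  # candidate solution u = L^{-1} v

if __name__ == "__main__":
    print(u(0.0), u(1.0))               # boundary values
    print(u(0.5))                       # compare with 0.5 * 0.5 / 2
    print(-second_difference(u, 0.5))   # compare with v(0.5)
```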


The last example is an example of a Sturm-Liouville operator. These operators 
will be studied in greater generality in Section 5. 

In the next example we shall consider a second-order differential operator on 
an infinite interval. 


EXAMPLE 2. Consider the operator

$$Lu = u'' - (1 + x^2)\,u$$

on $L_2(-\infty,\infty)$. We shall let the domain $\mathcal{D}_L$ be the collection of all $C^2$-functions
$u(x)$ with $Lu \in L_2(-\infty,\infty)$ and $u(x) \to 0$ as $x \to \pm\infty$. It is shown in Exercise 7 that
$L$ is one-to-one.

We claim that the inverse is given by

$$(L^{-1}v)(x) = \int_{-\infty}^{\infty} g(x,y)\,v(y)\,dy, \tag{7.2.4}⁵$$

where

$$g(x,y) = \begin{cases}
-\pi^{-1/2}\exp\!\left[\tfrac12(x^2+y^2)\right]\displaystyle\int_{-\infty}^{x} e^{-t^2}dt \int_{y}^{\infty} e^{-t^2}dt, & x \le y,\\[2ex]
-\pi^{-1/2}\exp\!\left[\tfrac12(x^2+y^2)\right]\displaystyle\int_{-\infty}^{y} e^{-t^2}dt \int_{x}^{\infty} e^{-t^2}dt, & y \le x.
\end{cases}$$

Since $g(x,y) = g(y,x)$ we see that the integral operator given by (7.2.4) is
self-adjoint. It is possible to show directly with appropriate estimates that

$$\int_{-\infty}^{\infty}\!\int_{-\infty}^{\infty} |g(x,y)|^2\,dx\,dy < \infty. \tag{7.2.5}$$

We will not do this here. However, in Exercise 7, Section 14, we shall actually
compute the integral in (7.2.5) by means of the Spectral Theorem.


⁵ The representation in (7.2.4) is strictly speaking an extension of $L^{-1}$ to all of $L_2(-\infty,\infty)$.




Through a straightforward but somewhat laborious computation one can
show that if $v$ is continuous and $u$ is determined by

$$u(x) = \int_{-\infty}^{\infty} g(x,y)\,v(y)\,dy,$$

then $Lu = v$. Furthermore, if $v \in L_2(-\infty,\infty)$, then $u(x) \to 0$ as $x \to \pm\infty$. Thus
$L^{-1}$ is given by (7.2.4). ∎
EXAMPLE 3. Consider the Laplacian operator

$$\Delta u = \frac{\partial^2 u}{\partial x^2} + \frac{\partial^2 u}{\partial y^2}$$

on $L_2(D)$, where $D$ is the unit disk in $R^2$,

$$D = \{(x,y) : x^2 + y^2 < 1\}.$$

Assume that the domain $\mathcal{D}_\Delta$ of $\Delta$ consists of all $C^2$-functions $u(x,y)$ with the
property that $\Delta u \in L_2(D)$ and $u = 0$ on $\partial D$, the boundary of $D$. We claim that the
inverse $\Delta^{-1}$ is given by

$$(\Delta^{-1}v)(x,y) = \int_D g(x,y;\xi,\eta)\,v(\xi,\eta)\,d\xi\,d\eta, \tag{7.2.6}$$

where

$$g(x,y;\xi,\eta) = \frac{1}{2\pi}\log\frac{\sigma_2}{\sigma_1\sigma_3}, \tag{7.2.7}$$

$\sigma_i$ is the distance between the points $P_{i-1}$ and $P_i$, where $P_0 = (0,0)$, $P_1 = (\xi,\eta)$,
$P_2 = (x,y)$, and

$$P_3 = \left(\frac{\xi}{\xi^2+\eta^2}, \frac{\eta}{\xi^2+\eta^2}\right).$$

Refer to Figure 7.2.1.

We shall not prove here that the integral operator given by (7.2.6) does indeed
represent $\Delta^{-1}$. We shall leave that as an exercise. One point should be made though.
The kernel $g$ is not continuous in $D$; with $(x,y)$ held fixed, $g$ has singularities at
$(\xi,\eta) = (x,y)$ and $(\xi,\eta) = (0,0)$. However, when $v$ is continuous in the interior of
$D$, the integral in (7.2.6) is well defined (see Exercise 3). ∎

Figure 7.2.1. Unit Disk in $R^2$.


EXERCISES 


1. In the following problems you are given a second-order differential operator
$L$, an interval $I$, and boundary conditions for restricting the domain of $L$ as was
done in Examples 1 and 2. You are also given a function $g(t,\tau)$ and you are
asked to show that it is a Green's function for $L$.

(a) $Lu = -u''$, $I = [0,1]$,
$u(0) = u'(1) = 0$,
$$g(t,\tau) = \begin{cases} \tau, & 0 \le \tau \le t \le 1,\\ t, & 0 \le t \le \tau \le 1. \end{cases}$$

(b) $Lu = -u''$, $I = [-1,1]$,
$u(-1) = u(1) = 0$,
$$g(t,\tau) = -\tfrac12\left(|t - \tau| + t\tau - 1\right).$$
(c) Lu =u", I= [0,1], 
u(0) = u(1), u’'(O) = u'(I), 
et+el-e™' O<t<t<l, 
(1 = e)g(t,t) -{5 elt ect. O<1<t<l. 
(4) Zduy=uO+u@, T=([0, 1], 
u(1) = 0, u(0) finite, 
_f{-logt, O<rt<t<1l 
Ht) =|"10 t, O<t<t<l. 
(e) (Lu)(x) = tu’(t) + u’(t) — A/2 + t/4)u(t), 
J =(0,+00), u(0) is finite, 


g(t,t) = 


u(t) >Oast> +0, 


+ t\ peer" 

exp(— J — dr, O<t<t, 
g(t,t) = ad Z 
15 a as 

exp 5 )f : dr, O0<tX<t. 


t 


(f) $Lu = \tfrac12 u''$, $I = [0,1]$,
$u(0) + u(1) = 0$, $u'(0) = 0$,
$$g(x,y) = \begin{cases} 2x - y - 1, & 0 \le y \le x \le 1,\\ y - 1, & 0 \le x \le y \le 1. \end{cases}$$

[Note: In this case $L^{-1}$ is not self-adjoint. Is $L^{-1}$ normal?]




2. Verify Equation (7.2.5). 


3. (a) Let $r = (\xi^2 + \eta^2)^{1/2}$ and $D$ be the unit disk in $R^2$. Show that the following
integrals are finite:

$$\int_D \log r\,d\xi\,d\eta, \qquad \int_D [\log r]^2\,d\xi\,d\eta.$$

[Hint: Change to polar coordinates.]

(b) Use the last result to show that if $v$ is continuous inside $D$, then the integral
in (7.2.6) is well defined.

(c) Show that

$$\int_D\!\int_D |g(x,y;\xi,\eta)|^2\,d\xi\,d\eta\,dx\,dy < \infty,$$

where $g$ is given by (7.2.7).
(d) Show that $g(x,y;\xi,\eta) = g(\xi,\eta;x,y)$, where $g$ is given by (7.2.7).


4. Show that if $v$ is continuous in the interior of $D$ and $v \in L_2(D)$ and if $u$ is
defined by

$$u(x,y) = \int_D g(x,y;\xi,\eta)\,v(\xi,\eta)\,d\xi\,d\eta,$$

where $g$ is given by (7.2.7), then $\Delta u = v$ in $D$, and $u = 0$ on $\partial D$.
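5. Consider the Laplacian operator

$$\Delta_3 u = \sum_{i=1}^{3} \frac{\partial^2 u}{\partial x_i^2}$$

on $L_2(D)$, where $D$ is the unit ball in $R^3$ and $x = (x_1,x_2,x_3) \in R^3$. Show that
the Green's function for $\Delta_3$ is given by

$$g(x,\xi) = -\frac{1}{4\pi}\left[\frac{1}{\sigma_2} - \frac{1}{\sigma_1\sigma_3}\right],$$

where $\sigma_i$ is the distance between the points $P_{i-1}$ and $P_i$ and $P_0 = (0,0,0)$,
$P_1 = (\xi_1,\xi_2,\xi_3)$, $P_2 = (x_1,x_2,x_3)$, and $P_3 = (\xi_1/|\xi|^2, \xi_2/|\xi|^2, \xi_3/|\xi|^2)$, where
$|\xi|^2 = \xi_1^2 + \xi_2^2 + \xi_3^2$.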

6. Show that the last exercise can be extended to the Laplacian operator

$$\Delta_n u = \sum_{i=1}^{n} \frac{\partial^2 u}{\partial x_i^2}$$

on $L_2(D)$, where $D$ is the unit ball in $R^n$, $n \ge 3$. In this case, the Green's function
becomes

$$g(x,\xi) = -\frac{1}{(n-2)\,\tau_{n-1}}\left[\frac{1}{\sigma_2^{\,n-2}} - \frac{1}{\sigma_1^{\,n-2}\sigma_3^{\,n-2}}\right],$$

where $\sigma_i$ is defined as in Exercise 5 and $\tau_{n-1}$ is the surface area of the $(n-1)$-dimensional
unit sphere in $R^n$.




7. Show that the operator $Lu = u'' - (1 + x^2)u$ given in Example 2 is one-to-one.
[Hint: Let $u(x) = \sum_{n=0}^{\infty} a_n x^n$ be a solution of $Lu = 0$. Show that no solution of
this form lies in $L_2(-\infty,\infty)$ with $u(x) \to 0$ as $x \to \pm\infty$.]


3. SYMMETRIC OPERATORS 


In this section we turn our attention to the first problem posed in the
introductory section of this chapter. That is, let $L$ be a linear operator on a Hilbert space
$H$ and let $\lambda_0$ be a complex number from the resolvent set of $L$. We now seek
conditions on $L$ itself in order that $(\lambda_0 I - L)^{-1}$ be self-adjoint.⁶

It seems natural to ask that the operator $L$ satisfy the relationship

$$(Lx,y) = (x,Ly) \tag{7.3.1}$$

for all $x$ and $y$ in the domain $\mathcal{D}_L$. The last equation is very important, and we
shall say that a linear operator $L$ is symmetric if (7.3.1) holds for all $x$ and $y$ in
$\mathcal{D}_L$. Some authors say that a linear operator $L$ is formally self-adjoint if (7.3.1)
holds. We prefer to use the term "symmetric" here and reserve the use of
"self-adjoint" for a more specialized concept which will be defined in Section 10.

One point should be emphasized before we proceed and that is that the concept
of symmetry for an operator $L$ depends on the domain of definition $\mathcal{D}_L$ for $L$.
This is an important observation because with many of the differential operators
we shall study there is wide latitude in the choice of the domain of definition. For
these operators, the domain of definition is oftentimes prescribed by means of
certain boundary conditions and, consequently, these boundary conditions play an
important role in determining whether the operator is symmetric or not.
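The dependence on the domain can be seen in a small computation. The sketch below is an illustration only, with hypothetical names and test functions that are not from the text: for $Lu = u''$ on $L_2[0,1]$ and real-valued $u$, $v$, integration by parts gives $(Lu,v) - (u,Lv) = [u'v - uv']_0^1$, so the difference vanishes when both functions satisfy $u(0) = u(1) = 0$ and need not vanish otherwise.

```python
import math

def integrate(f, n=2000):
    """Midpoint rule on [0, 1]."""
    h = 1.0 / n
    return sum(f((k + 0.5) * h) for k in range(n)) * h

def symmetry_defect(u, d2u, v, d2v):
    """(Lu, v) - (u, Lv) for Lu = u'' and real-valued u, v on [0, 1]."""
    return integrate(lambda t: d2u(t) * v(t)) - integrate(lambda t: u(t) * d2v(t))

# Both functions vanish at t = 0 and t = 1: the defect is zero up to quadrature error.
u1 = lambda t: math.sin(math.pi * t)
d2u1 = lambda t: -math.pi ** 2 * math.sin(math.pi * t)
v1 = lambda t: t * (1.0 - t)
d2v1 = lambda t: -2.0
inside = symmetry_defect(u1, d2u1, v1, d2v1)

# No boundary condition imposed: the boundary term u'v - uv' survives.
u2 = lambda t: t
d2u2 = lambda t: 0.0
v2 = lambda t: t * t
d2v2 = lambda t: 2.0
outside = symmetry_defect(u2, d2u2, v2, d2v2)

if __name__ == "__main__":
    print(inside, outside)
```

The second pair gives the boundary term $[u'v - uv']_0^1 = (1)(1) - (1)(2) = -1$, so the same differential expression fails (7.3.1) on the larger domain.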

We will consider some examples of symmetric linear operators shortly. However, 
before doing this it is useful to make note of the following results concerning the 
eigenvectors of a symmetric operator. 


7.3.1 LEMMA. Let $\lambda$ be an eigenvalue for a symmetric linear operator $L$
defined on a Hilbert space $H$. Then $\lambda$ is real.


Proof: Let $x \ne 0$ be an eigenvector associated with $\lambda$. By using (7.3.1) one
then has

$$(\lambda - \bar{\lambda})(x,x) = \lambda(x,x) - \bar{\lambda}(x,x) = (\lambda x,x) - (x,\lambda x) = (Lx,x) - (x,Lx) = 0.$$

Since $(x,x) \ne 0$, one has $\lambda = \bar{\lambda}$, that is, $\lambda$ is real. ∎


7.3.2 LEMMA. Let $L$ be a symmetric linear operator defined on a Hilbert space
$H$ and let $x_1$ and $x_2$ be eigenvectors of $L$ with associated eigenvalues $\lambda_1$ and $\lambda_2$. If
$\lambda_1 \ne \lambda_2$, then $x_1 \perp x_2$.


⁶ Please note that we are not seeking conditions on $L$ in order that $(\lambda_0 I - L)^{-1}$ be normal. We
restrict our attention to self-adjoint operators because the examples of interest fall into this
category. A discussion of normality for general (unbounded) linear operators can be found in Riesz
and Sz. Nagy [1; p. 349] and Stone [1; pp. 311-331].




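Proof: Since $Lx_i = \lambda_i x_i$, $i = 1, 2$, it follows from (7.3.1) that

$$(\lambda_1 - \lambda_2)(x_1,x_2) = \lambda_1(x_1,x_2) - \lambda_2(x_1,x_2) = (\lambda_1 x_1,x_2) - (x_1,\lambda_2 x_2) = (Lx_1,x_2) - (x_1,Lx_2) = 0.$$

Since $\lambda_1 \ne \lambda_2$, one has $(x_1,x_2) = 0$, or $x_1 \perp x_2$. ∎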


Thus the geometry of the eigenvectors for unbounded symmetric operators 
is similar to that for bounded self-adjoint operators. A very interesting, although 
not surprising, situation arises when a collection of eigenvectors for a symmetric 
operator forms an orthonormal basis for the Hilbert space. 


7.3.3 THEOREM. Let $L$ be a symmetric linear operator on a Hilbert space
$H$. Let $\{x_n\}$ be an orthonormal collection of eigenvectors for $L$ with associated
eigenvalues $\{\mu_n\}$. If $\{x_n\}$ is an orthonormal basis for $H$, then for every $x \in \mathcal{D}_L$ one has

$$Lx = L\Bigl(\sum_n (x,x_n)\,x_n\Bigr) = \sum_n L\bigl((x,x_n)\,x_n\bigr) = \sum_n (x,x_n)\,Lx_n = \sum_n \mu_n (x,x_n)\,x_n. \tag{7.3.2}⁷$$

Proof: The proof of this theorem is very simple. It is based on the Fourier
Series Theorem. If $x \in H$, then $x = \sum_n (x,x_n)x_n$, since $\{x_n\}$ is an orthonormal basis
for $H$. Similarly, if $x \in \mathcal{D}_L$, then

$$Lx = \sum_n (Lx,x_n)\,x_n = \sum_n (x,Lx_n)\,x_n = \sum_n (x,\mu_n x_n)\,x_n = \sum_n \mu_n (x,x_n)\,x_n = \sum_n (x,x_n)\,Lx_n. \; ∎$$

Even though the last theorem is a relatively simple result, it is nevertheless a
very practical result. The reason for this is that for certain linear operators (especially
for differential operators arising in boundary value problems) it is relatively easy
to determine the collection of all eigenvectors. The only hypothesis of Theorem
7.3.3 that causes some concern in applications is showing that a given orthonormal
collection of eigenvectors forms an orthonormal basis for the Hilbert space. In
order to solve this problem one can either use the techniques of Section 5.18 or one
can use some of the results presented in Section 5.


EXERCISES 


1. Show that the operators defined in Exercises 1(a), (b), (c), (d), and (e), Section 2
are symmetric.


2. Show that the operator defined in Exercise 1 (f), Section 2 is not symmetric. 


⁷ Equation (7.3.2) holds for all bounded operators $L$. What is important here is that $L$ may be
unbounded.




3. Consider the operator $Lu = u''$ on $L_2[0,1]$ with boundary condition $u'(0) = 0$.
(a) Show that every real number $\lambda$ is an eigenvalue for L.
(b) Does L have other eigenvalues?
(c) Show that L is an extension of the operator 2L, where L is defined in
Exercise 1 (f), Section 2.
(d) Is L symmetric?


4. Find the eigenvalues and eigenvectors for the operators defined in Exercise 1,
Section 2. 


5. Find the eigenvalues and eigenvectors for the operator defined in Example 1, 
Section 2. 


6. Find the eigenvalues and eigenvectors for the Laplacian operator defined in 
Example 3, Section 2. [Hint: Express the Laplacian operator in polar coordinates 
(r,@) and separate the variables. ] 


4. EXAMPLES OF SYMMETRIC OPERATORS 
The following operators are important in quantum mechanics. 


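EXAMPLE 1. (THE POSITION OPERATOR Q.) Consider the Hilbert space
$L_2(-\infty,\infty)$. Let

$$\mathcal{D}_Q = \Big\{u \in L_2(-\infty,\infty) : \int_{-\infty}^{\infty} |x\,u(x)|^2\,dx < \infty\Big\}$$

and define $Q: \mathcal{D}_Q \to L_2(-\infty,\infty)$ by

$$(Qu)(x) = x\,u(x).$$  (7.4.1)

Since $C_0^\infty(-\infty,\infty) \subset \mathcal{D}_Q$, we see that Q is densely defined. Furthermore, if
$u, v \in \mathcal{D}_Q$, then

$$(Qu,v) = \int_{-\infty}^{\infty} x\,u(x)\,\overline{v(x)}\,dx = \int_{-\infty}^{\infty} u(x)\,\overline{x\,v(x)}\,dx = (u,Qv).$$

Thus Q is symmetric.
We will show in the exercises that the spectrum of the operator Q is only a
continuous spectrum and that it is precisely the real line R. ∎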


EXAMPLE 2. (THE POSITION OPERATOR $Q_k$.) Consider now the Hilbert space
$L_2(R^n)$, where $x = (x_1,\ldots,x_n) \in R^n$. For k = 1, 2,..., n let

$$\mathcal{D}_{Q_k} = \Big\{u(x) \in L_2(R^n) : \int_{R^n} |x_k|^2\,|u(x)|^2\,dx < \infty\Big\},$$

where $dx = dx_1 \cdots dx_n$, and define $Q_k$ by

$$Q_k : u(x_1,\ldots,x_n) \to x_k\,u(x_1,\ldots,x_n),$$  (7.4.2)

where $u(x) \in \mathcal{D}_{Q_k}$. It is easy to see that $Q_k$ is a symmetric operator. Also, it is easy
to show that the spectrum of $Q_k$, which is a continuous spectrum only, consists of
the real line R. ∎




EXAMPLE 3. (POTENTIAL OPERATOR V(Q).) For k = 1,..., n, let $Q_k$ be given
by the last example. Let $V(x_1,\ldots,x_n) = V(x)$ be a real-valued, continuous function
defined for $x = (x_1,\ldots,x_n) \in R^n$ and define $V(Q) = V(Q_1,\ldots,Q_n)$ by

$$V(Q) : u(x) \to V(x)\,u(x).$$  (7.4.3)

If we define the domain of V(Q) by

$$\mathcal{D}_{V(Q)} = \{u(x) \in L_2(R^n) : V(x)u(x) \in L_2(R^n)\},$$

then it is easy to see that V(Q) is symmetric.

It is not necessary to assume that V(x) is continuous. The function may even
have poles in $R^n$. An example, which arises in some applications in mechanics,
occurs when $n = 3l$. If

$$r_{0k} = \big[x_{3k-2}^2 + x_{3k-1}^2 + x_{3k}^2\big]^{1/2}, \qquad 1 \le k \le l,$$
$$r_{jk} = \big[(x_{3j-2} - x_{3k-2})^2 + (x_{3j-1} - x_{3k-1})^2 + (x_{3j} - x_{3k})^2\big]^{1/2},$$

then

$$V(x) = \sum_{j \neq k} \frac{c_{jk}}{r_{jk}},$$

where the $c_{jk}$ are real numbers. The corresponding operator V(Q), in this case, is
called a Coulomb potential. This type of potential arises in quantum mechanics
when one is studying the interaction between $l$ electrons and a nucleus. ∎


EXAMPLE 4. (THE MOMENTUM OPERATOR P.) Consider the Hilbert space
$L_2(-\infty,\infty)$. Define $P: \mathcal{D}_P \to L_2(-\infty,\infty)$ by

$$P : u(x) \to -i\,u'(x),$$  (7.4.4)

where the domain $\mathcal{D}_P$ is the space $C_0^1(-\infty,\infty)$, the space of all $C^1$-functions with
compact support. Since $C_0^1(-\infty,\infty)$ is dense in $L_2(-\infty,\infty)$ we see that P is
densely defined. Furthermore, if $u, v \in \mathcal{D}_P$, then by integrating by parts we get

$$(Pu,v) = -i\int_{-\infty}^{\infty} u'(x)\,\overline{v(x)}\,dx$$
$$= -i \lim_{R\to\infty}\big(u(R)\overline{v(R)} - u(-R)\overline{v(-R)}\big) + \int_{-\infty}^{\infty} u(x)\,\overline{(-i\,v'(x))}\,dx$$
$$= (u,Pv).$$

Hence P is symmetric. The spectrum of this operator is analyzed in Exercise 5. ∎
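The integration by parts in Example 4 has a simple discrete analogue. The sketch below is an illustration under stated assumptions, not the construction of the text: the derivative is replaced by a central-difference matrix on a hypothetical uniform grid, and rapidly decaying Gaussians stand in for functions of compact support. The resulting matrix for $-i\,d/dx$ is Hermitian, which is the finite-dimensional counterpart of $(Pu,v) = (u,Pv)$.

```python
import numpy as np

n, h = 400, 0.05                       # hypothetical grid: n points with spacing h
x = (np.arange(n) - n // 2) * h        # grid covering roughly [-10, 10)

# Central-difference matrix: (D u)_j = (u_{j+1} - u_{j-1}) / (2h)
D = (np.eye(n, k=1) - np.eye(n, k=-1)) / (2.0 * h)
P = -1j * D                            # discrete stand-in for P = -i d/dx

def inner(f, g):
    """Discrete L2 inner product, conjugate-linear in the second argument."""
    return h * np.sum(f * np.conj(g))

# Test functions that are numerically zero near the ends of the grid
u = np.exp(-x ** 2) * (1.0 + 0.5j * x)
v = np.exp(-(x - 1.0) ** 2) * np.cos(x)

lhs = inner(P @ u, v)                  # (Pu, v)
rhs = inner(u, P @ v)                  # (u, Pv)
print(abs(lhs - rhs))
```

The matrix D is real and antisymmetric, so multiplying by $-i$ gives a Hermitian matrix; the factor $-i$ in (7.4.4) plays exactly this role for the differential operator.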


EXAMPLE 5. (THE MOMENTUM OPERATOR $P_k$.) Consider now the Hilbert space
$L_2(R^n)$. For k = 1, 2,..., n define $P_k$ by

$$P_k : u(x_1,\ldots,x_n) \to -i\,\frac{\partial}{\partial x_k}\,u(x_1,\ldots,x_n),$$  (7.4.5)




where the domain of $P_k$ is $C_0^\infty(R^n)$. By integrating by parts one can easily show that
$P_k$ is symmetric.
The operator $P_k^2$ is also of interest. This is, of course, the operator

$$P_k^2 : u(x_1,\ldots,x_n) \to -\frac{\partial^2}{\partial x_k^2}\,u(x_1,\ldots,x_n),$$  (7.4.6)

which is also symmetric on $C_0^\infty(R^n)$. ∎


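EXAMPLE 6. (THE LAPLACIAN OPERATOR $\Delta_n$.) The Laplacian operator $\Delta_n$ can
be represented in terms of the momentum operators $P_k$, k = 1, 2,..., n,

$$\Delta_n = -\sum_{k=1}^{n} P_k^2,$$

which means that

$$\Delta_n u = \frac{\partial^2 u}{\partial x_1^2} + \cdots + \frac{\partial^2 u}{\partial x_n^2},$$

for $u \in C_0^\infty(R^n)$.
In order to show that $\Delta_n$ is symmetric we use Green's formula (see Taylor [1,
pp. 458-459])

$$\int_\Omega (u\,\Delta_n \bar v - \bar v\,\Delta_n u)\,dx = \int_{\partial\Omega} \Big(u\,\frac{\partial \bar v}{\partial \nu} - \bar v\,\frac{\partial u}{\partial \nu}\Big)\,dS,$$

where $\Omega$ is a bounded region in $R^n$, $\partial/\partial\nu$ denotes the outward-normal derivative, and
$\partial\Omega$ is the boundary of $\Omega$. Now let $u, v \in C_0^\infty(R^n)$ and choose $\Omega$ large enough to
contain the support of both u and v. One then has u = v = 0 on $\partial\Omega$, together with
their derivatives, and it follows from Green's formula that

$$(\Delta_n u,v) - (u,\Delta_n v) = \int_{R^n} \big[(\Delta_n u)\bar v - u(\Delta_n \bar v)\big]\,dx = \int_\Omega \big[(\Delta_n u)\bar v - u(\Delta_n \bar v)\big]\,dx = 0.$$

Hence $\Delta_n$ is symmetric. ∎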


EXERCISES 


1. (a) Show that the operator Q in (7.4.1) has no point spectrum.
(b) Show that every $\lambda \in R$ is in the continuous spectrum of Q.
(c) Show that every nonreal complex number is in the resolvent set of Q.
(Compare this with Exercise 7, Section 6.6.)


2. Let $f: R \to C$ be a continuous function and define f(Q) by
$$f(Q) : u(x) \to f(x)\,u(x),$$
where the domain of f(Q) is
$$\mathcal{D} = \{u(x) \in L_2(-\infty,\infty) : f(x)u(x) \in L_2(-\infty,\infty)\}.$$




(a) Show that f(Q) is symmetric when f(x) is real valued.

(b) Show that the spectrum of f(Q) is $\overline{f(R)}$, the closure of f(R).

(c) Show that $\lambda$ is an eigenvalue of f(Q) if and only if the set $f^{-1}(\{\lambda\})$ has
positive measure.

(d) Let $\lambda_0$ be an eigenvalue of f(Q). Show that the null space $\mathcal{N}(f(Q) - \lambda_0 I)$
is infinite dimensional.


3. (a) Show that the operator $Q_k$ given by (7.4.2) is symmetric.
(b) Show that the spectrum of $Q_k$ is the real line R and that this is a continuous
spectrum.


4. (a) Show that the operator V(Q) given by (7.4.3) is symmetric when V(x) is a 
real-valued continuous function. 
(b) Show that a Coulomb potential V(Q) is a symmetric operator on an 
appropriate domain of definition. 


5. Analyze the spectrum of 
(a) The momentum operator P. 
(b) The momentum operator P,. 
[Hint: Use Example 6, Section 6.6.] 


5. STURM-LIOUVILLE OPERATORS 


The operators we shall study in this section arise in the context of second-order 
ordinary linear differential equations of the form 


$$-(pu')' + qu = \lambda w u,$$
or
$$(pu')' + (\lambda w - q)u = 0,$$

subject to appropriate boundary conditions. Operators of this type are rather
common in applications. They arise in the study of vibrating strings, vibrating
membranes, resonance in a cavity, transmission lines, and wave guides, as well as optimal
control problems. We begin with the definition of a (nonsingular) Sturm-Liouville
operator.


7.5.1 DEFINITION. Let p(x), p'(x), q(x), and w(x) be continuous real-valued
functions on the finite interval $a \le x \le b$, where a < b, and assume that p(x) > 0
and w(x) > 0 for $a \le x \le b$. Let H be the complex Hilbert space

$$H = \Big\{u : \int_a^b |u(x)|^2\,w(x)\,dx < \infty\Big\},$$

where the inner product on H is given by

$$(u,v) = \int_a^b u(x)\,\overline{v(x)}\,w(x)\,dx.$$  (7.5.1)




[The term w(x) in (7.5.1) is called a weight function.] Now let $\mathcal{D}$ denote the collection
of all $C^2$-functions u(x) in H satisfying the boundary conditions:

$$B_1 u = B_2 u = 0,$$  (7.5.2)

where

$$B_1 u = \beta_1 u(a) + \gamma_1 u'(a), \qquad B_2 u = \beta_2 u(b) + \gamma_2 u'(b),$$  (7.5.3)

and $\beta_1, \beta_2, \gamma_1, \gamma_2$ are real constants with

$$|\beta_1| + |\gamma_1| > 0, \qquad |\beta_2| + |\gamma_2| > 0.$$  (7.5.4)

Under these conditions the operator

$$Lu = \frac{1}{w}\big[-(pu')' + qu\big]$$  (7.5.5)

is said to be a Sturm-Liouville operator if $\mathcal{D}_L = \mathcal{D}$. Since $C_0^\infty[a,b] \subset \mathcal{D}$, we see that
L is densely defined.


7.5.2 LEMMA. Let L be a Sturm-Liouville operator. Then for every real number
$\lambda$, the operator $L - \lambda I$ is a Sturm-Liouville operator with $\mathcal{D}_{L-\lambda I} = \mathcal{D}_L$.


Proof: Since

$$(L - \lambda I)u = \frac{1}{w}\big[-(pu')' + (q - \lambda w)u\big],$$

the conclusion is obvious. ∎


7.5.3 THEOREM. Let L be a Sturm-Liouville operator. Then L is symmetric.
Proof: Let L be given by (7.5.5) and let u and v be in $\mathcal{D}$. Then

$$(Lu,v) - (u,Lv) = \int_a^b \{-(pu')'\bar v + qu\bar v + (p\bar v')'u - qu\bar v\}\,dx = \int_a^b \{-(pu')'\bar v + (p\bar v')'u\}\,dx.$$

By integrating by parts we get

$$(Lu,v) - (u,Lv) = p(b)\big[u(b)\bar v'(b) - u'(b)\bar v(b)\big] - p(a)\big[u(a)\bar v'(a) - u'(a)\bar v(a)\big].$$  (7.5.6)

Since $\beta_i$ and $\gamma_i$ in (7.5.3) are real, we see that $\bar u$ and $\bar v$ also lie in $\mathcal{D}$. If we apply
$B_1$ to u and $\bar v$, we get

$$\beta_1 u(a) + \gamma_1 u'(a) = 0,$$
$$\beta_1 \bar v(a) + \gamma_1 \bar v'(a) = 0.$$  (7.5.7)




It follows from (7.5.4) that at least one of the terms $\beta_1$, $\gamma_1$ is nonzero. Hence the
determinant of the system (7.5.7) must vanish, that is,

$$\begin{vmatrix} u(a) & u'(a) \\ \bar v(a) & \bar v'(a) \end{vmatrix} = 0.$$

If we apply $B_2$ to u and $\bar v$, we get a system of equations similar to
(7.5.7). From this we conclude, in an analogous manner, that

$$\begin{vmatrix} u(b) & u'(b) \\ \bar v(b) & \bar v'(b) \end{vmatrix} = 0.$$

If the last two equalities are inserted in (7.5.6) we see that (Lu,v) = (u,Lv).
Hence L is symmetric. ∎


Carefully note that the symmetry of L in the last theorem is determined by the 
form of (7.5.5) along with the boundary conditions (7.5.2). 
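The role of the boundary conditions can be checked numerically. The sketch below is an illustration with assumed data, none of it from the text: $p(x) = 1 + x$, $q(x) = x$, $w(x) = 1 + x^2$ on [0,1], with boundary conditions $u(0) = 0$ and $u'(1) = 0$. The two test functions satisfy these conditions, their derivatives are coded by hand, and the weighted inner products $(Lu,v)$ and $(u,Lv)$ are approximated with the trapezoid rule.

```python
import numpy as np

def trapezoid(y, x):
    """Composite trapezoid rule; works for complex-valued samples."""
    return np.sum((y[1:] + y[:-1]) * np.diff(x)) / 2.0

x = np.linspace(0.0, 1.0, 20001)
p, dp = 1.0 + x, np.ones_like(x)   # assumed p(x) = 1 + x and its derivative
q = x                              # assumed q(x) = x
w = 1.0 + x ** 2                   # assumed weight function

def apply_L(f, df, d2f):
    """Lf = (1/w) [ -(p f')' + q f ], using (p f')' = p' f' + p f''."""
    return (-(dp * df + p * d2f) + q * f) / w

def inner(f, g):
    """Weighted inner product (7.5.1)."""
    return trapezoid(f * np.conj(g) * w, x)

# u(0) = 0 and u'(1) = 0
u, du, d2u = x * (2.0 - x), 2.0 - 2.0 * x, -2.0 * np.ones_like(x)

# Complex-valued v with v(0) = 0 and v'(1) = 0
v = np.sin(np.pi * x / 2) + 1j * x ** 2 * (3.0 - 2.0 * x)
dv = (np.pi / 2) * np.cos(np.pi * x / 2) + 1j * (6.0 * x - 6.0 * x ** 2)
d2v = -(np.pi / 2) ** 2 * np.sin(np.pi * x / 2) + 1j * (6.0 - 12.0 * x)

lhs = inner(apply_L(u, du, d2u), v)    # (Lu, v)
rhs = inner(u, apply_L(v, dv, d2v))    # (u, Lv)
print(lhs, rhs)
```

Replacing v by a function that violates one of the boundary conditions leaves a nonzero boundary term from (7.5.6), and the two inner products no longer agree.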

Let us now study the eigenvalues of the Sturm-Liouville operator L given by
(7.5.5) and (7.5.2). The equation $Lu = \lambda u$ reduces to

$$u'' + \frac{p'}{p}\,u' + \frac{\lambda w - q}{p}\,u = 0.$$  (7.5.8)

Let $u_1(x,\lambda)$ and $u_2(x,\lambda)$ be a fundamental system for (7.5.8), that is, $u_1(x,\lambda)$ and
$u_2(x,\lambda)$ are two linearly independent solutions of (7.5.8). We claim that $\lambda$ is an
eigenvalue for L if and only if

$$\Delta(\lambda) = \begin{vmatrix} B_1 u_1 & B_1 u_2 \\ B_2 u_1 & B_2 u_2 \end{vmatrix} = 0.$$  (7.5.9)

Indeed, every solution of (7.5.8) can be written as $u = c_1 u_1 + c_2 u_2$, where $c_1$ and
$c_2$ are complex numbers. The boundary conditions (7.5.2) then become

$$B_1 u = c_1 B_1 u_1 + c_2 B_1 u_2 = 0,$$
$$B_2 u = c_1 B_2 u_1 + c_2 B_2 u_2 = 0.$$  (7.5.10)

If $\lambda$ is an eigenvalue, then there is a nontrivial solution $(c_1,c_2)$ of (7.5.10); hence
the determinant $\Delta(\lambda)$ must vanish. Conversely, if the determinant $\Delta(\lambda)$ does vanish,
then we can find a nontrivial solution $(c_1,c_2)$ of (7.5.10). This then defines an
eigenfunction $u = c_1 u_1 + c_2 u_2$ for the eigenvalue $\lambda$.
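The determinant criterion (7.5.9) can be turned into a small root-finding computation. The sketch below assumes a concrete case that is not taken from the text: $p = w = 1$, $q = 0$ (so $Lu = -u''$) on [0,1] with $B_1 u = u(0)$ and $B_2 u = u'(1)$. For $\lambda > 0$ a fundamental system of (7.5.8) is $u_1 = \cos(\sqrt{\lambda}\,x)$, $u_2 = \sin(\sqrt{\lambda}\,x)/\sqrt{\lambda}$. The function names are hypothetical.

```python
import math

def boundary_determinant(lam):
    """Delta(lambda) of (7.5.9) for -u'' = lambda u, B1 u = u(0), B2 u = u'(1)."""
    s = math.sqrt(lam)
    B1_u1 = 1.0                 # u1(0) with u1 = cos(s x)
    B1_u2 = 0.0                 # u2(0) with u2 = sin(s x) / s
    B2_u1 = -s * math.sin(s)    # u1'(1)
    B2_u2 = math.cos(s)         # u2'(1)
    return B1_u1 * B2_u2 - B1_u2 * B2_u1

def bisect(f, lo, hi, tol=1e-12):
    """Plain bisection on a bracket [lo, hi] with a sign change."""
    f_lo = f(lo)
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        f_mid = f(mid)
        if f_lo * f_mid <= 0.0:
            hi = mid
        else:
            lo, f_lo = mid, f_mid
    return 0.5 * (lo + hi)

# Scan 0 < lambda < 100 for sign changes of Delta and refine each bracket
grid = [0.1 + 0.5 * k for k in range(200)]
eigenvalues = [
    bisect(boundary_determinant, a, b)
    for a, b in zip(grid[:-1], grid[1:])
    if boundary_determinant(a) * boundary_determinant(b) < 0.0
]
print(eigenvalues)
```

In this assumed case $\Delta(\lambda) = \cos\sqrt{\lambda}$, so the computed roots can be compared with $((n - \tfrac12)\pi)^2$, n = 1, 2, 3.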

If 0 is not an eigenvalue of L, then L is one-to-one and the inverse $L^{-1}$ is
defined on the range $\mathcal{R}_L$. Let us now show that $L^{-1}$ can be represented as an
integral operator where the Green's function is continuous. In the proof of the
next theorem we shall give a method for constructing the Green's function of any
Sturm-Liouville operator that does not have 0 as an eigenvalue.


7.5.4 THEOREM. Let L be a Sturm-Liouville operator given by (7.5.5) and
(7.5.2) and assume that 0 is not an eigenvalue of L. Then there is a continuous function
g(x,y) defined for $a \le x, y \le b$ such that

$$(L^{-1}v)(x) = \int_a^b g(x,y)\,v(y)\,w(y)\,dy$$




for all v in the range $\mathcal{R}_L$. Furthermore, g(x,y) can be chosen so that it is real-valued
and g(x,y) = g(y,x). Consequently, $L^{-1}$ is a compact, self-adjoint^8 operator.


Proof: For j = 1, 2 let $u_j(x)$ be a nontrivial real solution of the boundary-value
problem:

$$pu'' + p'u' - qu = 0, \qquad B_j u_j = 0.$$

Since 0 is not an eigenvalue of L, it follows from the consideration above that
$u_1(x)$ and $u_2(x)$ are linearly independent, and $B_i u_j \neq 0$ for $i \neq j$. Let

$$W(x) = \begin{vmatrix} u_1(x) & u_2(x) \\ u_1'(x) & u_2'(x) \end{vmatrix}$$

be the Wronskian for these two solutions. Then the general solution of Lu = v or

$$-(pu')' + qu = wv$$

becomes

$$u(x) = c_1 u_1(x) + c_2 u_2(x) + z(x),$$

where

$$z(x) = \int_a^x \frac{u_1(x)u_2(y) - u_2(x)u_1(y)}{p(y)W(y)}\,v(y)\,w(y)\,dy.$$

We want to choose the coefficients so that when v is continuous then u is in $\mathcal{D}$, that
is, u satisfies the boundary conditions (7.5.2). If we set $c_2 = 0$, and

$$c_1 = -\int_a^b \frac{u_2(y)}{p(a)W(a)}\,v(y)\,w(y)\,dy,$$

then, since p(y)W(y) is constant and equal to p(a)W(a), the solution u(x) becomes

$$u(x) = -\int_x^b \frac{u_1(x)u_2(y)}{p(y)W(y)}\,v(y)\,w(y)\,dy - \int_a^x \frac{u_2(x)u_1(y)}{p(y)W(y)}\,v(y)\,w(y)\,dy = \int_a^b g(x,y)\,v(y)\,w(y)\,dy,$$

where g(x,y) satisfies

$$p(a)W(a)\,g(x,y) = \begin{cases} -u_1(x)u_2(y), & a \le x \le y \le b, \\ -u_2(x)u_1(y), & a \le y \le x \le b. \end{cases}$$

It is easy to verify that this function u satisfies the boundary conditions (7.5.2).
Furthermore the kernel g(x,y) is evidently continuous.

Since $u_1$ and $u_2$ are real-valued functions we see that g(x,y) is real-valued and
g(x,y) = g(y,x). Thus, as noted in Section 2, the operator $L^{-1}$ is compact and
self-adjoint. ∎
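The construction in the proof can be carried out explicitly in a simple assumed case, chosen here for illustration and not taken from the text: $Lu = -u''$ on [0,1] with $u(0) = u(1) = 0$, so $p = w = 1$ and $q = 0$. Then $u_1(x) = x$ and $u_2(x) = 1 - x$ satisfy $B_1 u_1 = 0$ and $B_2 u_2 = 0$, the Wronskian is $W = -1$, and the formula for g gives $g(x,y) = x(1-y)$ for $x \le y$. The sketch below builds g on a grid and applies the integral operator to $v(y) = \sin \pi y$, for which the exact solution of $-u'' = v$ is $\sin(\pi x)/\pi^2$.

```python
import numpy as np

def green(x, y):
    """g(x, y) from the proof for -u'' on [0, 1] with Dirichlet conditions."""
    u1 = lambda s: s            # satisfies B1 u1 = u1(0) = 0
    u2 = lambda s: 1.0 - s      # satisfies B2 u2 = u2(1) = 0
    pW = -1.0                   # p(a) W(a), with p = 1 and W = u1 u2' - u1' u2 = -1
    return np.where(x <= y, -u1(x) * u2(y) / pW, -u2(x) * u1(y) / pW)

n = 1001
grid = np.linspace(0.0, 1.0, n)
h = grid[1] - grid[0]
X, Y = np.meshgrid(grid, grid, indexing="ij")
G = green(X, Y)

# Trapezoid weights for the integral over y (the weight function w is 1 here)
weights = np.full(n, h)
weights[0] = weights[-1] = h / 2.0

v = np.sin(np.pi * grid)
u = G @ (v * weights)                       # u(x) = int g(x, y) v(y) dy
exact = np.sin(np.pi * grid) / np.pi ** 2   # solves -u'' = v, u(0) = u(1) = 0

max_error = float(np.max(np.abs(u - exact)))
print(max_error)
```

The matrix G is symmetric, reflecting g(x,y) = g(y,x), and the computed u vanishes at both endpoints, as the boundary conditions require.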


8 See footnote concerning Theorem 7.1.1.




What happens if 0 is an eigenvalue of L? Then, of course, L does not have an
inverse. However, if $\lambda$ is a real number and is not an eigenvalue of L, then
$(\lambda I - L)^{-1}$ exists. If we now combine Lemma 7.5.2 and the last theorem we get the
next result.


7.5.5 COROLLARY. Let L be a Sturm-Liouville operator given by (7.5.5) and
assume that a real number $\lambda$ is not an eigenvalue of L. Then there is a Green's function
$g(x,y,\lambda)$, defined for $a \le x, y \le b$, continuous in x and y and such that

$$\big((\lambda I - L)^{-1}v\big)(x) = \int_a^b g(x,y,\lambda)\,v(y)\,w(y)\,dy$$

for all v in the range of $\lambda I - L$. Moreover the operator $(\lambda I - L)^{-1}$ is compact and
self-adjoint.


Let L be a given Sturm-Liouville operator. With the above corollary in mind
we pose the question: "Does there exist a real number $\lambda$ with the property that $\lambda$
is not an eigenvalue of L?" The answer is definitely yes.


7.5.6 THEOREM. Let L be a Sturm-Liouville operator. Then L has at most a
countable number of eigenvalues, all of which are real.


Proof: It follows from Lemma 7.3.2 that the eigenvectors of L, associated
with distinct eigenvalues, are mutually orthogonal. Since the Hilbert space H is
separable (Example 12, Section 3.12) there are at most a countable number of
nonzero mutually orthogonal vectors. Hence L has at most a countable number of
eigenvalues. ∎


In the exercises you will be asked to show that the eigenvalues of L are
actually bounded below.
We can now state our main result.


7.5.7 THEOREM. Let L be a Sturm-Liouville operator. Then there exists a
sequence of real numbers $\{\mu_1,\mu_2,\ldots\}$ and an orthonormal basis $\{\phi_1,\phi_2,\ldots\}$ of H
such that each $\phi_n$ is a $C^2$-function and $L\phi_n = \mu_n\phi_n$.

Moreover, each real number $\mu$ appears at most twice in the sequence $\{\mu_1,\mu_2,\ldots\}$.
Furthermore, each function u in H can be written in terms of the Fourier series

$$u = \sum_{n=1}^{\infty} (u,\phi_n)\phi_n.$$

If $u \in \mathcal{D}_L$, then

$$\sum_{n=1}^{\infty} |(u,\phi_n)|^2\,\mu_n^2 < \infty$$  (7.5.11)

and

$$Lu = \sum_{n=1}^{\infty} \mu_n (u,\phi_n)\phi_n.$$




Proof: Choose a real number $\lambda$ so that $\lambda$ is not an eigenvalue of L. The
operator $(\lambda I - L)^{-1}$ is then compact and self-adjoint by Corollary 7.5.5. Now
apply the Spectral Theorem to $(\lambda I - L)^{-1}$ and let $\{\hat\mu_1,\hat\mu_2,\ldots\}$ be the eigenvalues
of $(\lambda I - L)^{-1}$ with corresponding eigenfunctions $\{\phi_1,\phi_2,\ldots\}$. It follows
from Theorem 7.1.1 that if

$$\mu_n = \lambda - \frac{1}{\hat\mu_n},$$

then $\{\mu_1,\mu_2,\ldots\}$ are the eigenvalues of L with corresponding eigenfunctions
$\{\phi_1,\phi_2,\ldots\}$.

Each $\phi_n$ is a $C^2$-function since it is the solution of a second-order differential
equation (7.5.8) with continuous coefficients. Also, each eigenvalue appears at
most twice since Equation (7.5.8) has at most two linearly independent solutions
for each $\lambda$.

The characterization of the domain $\mathcal{D}_L$ in terms of (7.5.11) we shall leave as
an exercise. ∎
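The relation $\mu_n = \lambda - 1/\hat\mu_n$ used in the proof can be observed on a finite-dimensional stand-in. The sketch below is an assumption-laden illustration rather than the construction of the text: $Lu = -u''$ with $u(0) = u(1) = 0$ is replaced by the standard second-difference matrix on N interior grid points, and $\lambda = -1$ is chosen because it is not an eigenvalue of this positive definite matrix.

```python
import numpy as np

N = 200                                   # hypothetical number of interior grid points
h = 1.0 / (N + 1)

# Second-difference approximation of -d^2/dx^2 with Dirichlet conditions
L_h = (2.0 * np.eye(N) - np.eye(N, k=1) - np.eye(N, k=-1)) / h ** 2

lam = -1.0                                # a real number that is not an eigenvalue
resolvent = np.linalg.inv(lam * np.eye(N) - L_h)
resolvent = 0.5 * (resolvent + resolvent.T)   # remove rounding asymmetry

mu_hat = np.linalg.eigvalsh(resolvent)    # eigenvalues of (lam I - L)^{-1}
mu = np.sort(lam - 1.0 / mu_hat)          # recovered eigenvalues of L
mu_direct = np.linalg.eigvalsh(L_h)       # eigenvalues computed directly

print(mu[:3], [(n * np.pi) ** 2 for n in (1, 2, 3)])
```

The recovered values agree with the directly computed eigenvalues of the matrix, and the smallest ones approximate $(n\pi)^2$, the eigenvalues of the differential operator in this assumed case.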


A word on methodology seems warranted. In this section we constructed a 
Green’s function for the inverse of a Sturm-Liouville operator, and you will be 
asked to work out some specific examples in the exercises. In practice one seldom 
seeks the eigenvalues and eigenfunctions in this fashion. It is almost always easier 
to work directly with the differential equation and boundary conditions. 


EXERCISES 


1. In the following problems you are given a second-order differential operator L,
an interval I, and boundary conditions so that L becomes a Sturm-Liouville
operator. You are asked to find the collection of eigenfunctions and associated
eigenvalues for L.

(a) $Lu = u''$, I = [0,1],
    $u(0) = u'(1) = 0$.
(b) $Lu = u''$, I = [0,1],
    $u(0) = u(1) = 0$.
(c) $Lu = \tau u''$, I = [0,l],
    $u(0) = u(l) = 0$ ($\tau > 0$).
(d) $Lu = u''$, I = [-1,1],
    $u(-1) = u(1) = 0$.
(e) $Lu = u''$, I = [0,1],
    $u'(0) = u'(1) = 0$.

2. Extend the theory of Sturm-Liouville operators to operators of the form (7.5.5)
with the more general boundary conditions, namely,

$$B_1 u = B_2 u = 0,$$




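where

$$B_1 u = \beta_{11}u(a) + \beta_{12}u'(a) + \beta_{13}u(b) + \beta_{14}u'(b)$$

and

$$B_2 u = \beta_{21}u(a) + \beta_{22}u'(a) + \beta_{23}u(b) + \beta_{24}u'(b),$$

where the $\beta_{ij}$'s are real,

$$\operatorname{rank}\begin{pmatrix} \beta_{11} & \beta_{12} & \beta_{13} & \beta_{14} \\ \beta_{21} & \beta_{22} & \beta_{23} & \beta_{24} \end{pmatrix} = 2,$$

and

$$p(a)\big[\beta_{13}\beta_{24} - \beta_{14}\beta_{23}\big] = p(b)\big[\beta_{11}\beta_{22} - \beta_{12}\beta_{21}\big].$$

(See Hellwig [1, pp. 39-47] and Coddington and Levinson [1, Chaps. 7, 11].)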


3. This exercise follows the format of Exercise 1, but uses the conclusion of 
Exercise 2. 


(a) $Lu = u''$, I = [0,1],
    $u(0) = u(1)$, $u'(0) = u'(1)$.
(b) Lu=u" —4, I =([0,1], 
    $u(0) - u'(0) + u(1) = 0$,
    $u(0) + u'(0) + 2u'(1) = 0$.


4. If the term p(x) in (7.5.5) vanishes at one of the endpoints or if the interval
[a,b] becomes infinite, then the Sturm-Liouville operator is said to be singular.
For some of these singular operators the theory is similar to that of nonsingular
Sturm-Liouville operators. Many of the classical functions arise in this way. We
follow the format of Exercise 1. For these problems we consider the Hilbert
space $L_2(I)$ with the inner product given by $(u,v) = \int_I u\bar v\,w\,dx$, where w > 0 is a
weight function.

(a) (Legendre Polynomials): $Lu = [(x^2 - 1)u']'$, I = [-1,1], w(x) = 1, u(1) and
u(-1) are finite.

(b) (Tchebychev Polynomials): $Lu = (1 - x^2)u'' - xu'$, I = [-1,1], $w(x) = (1 - x^2)^{-1/2}$,
u(1) and u(-1) are finite.

(c) (Laguerre Polynomials): Lu = xu" + (1 — x)u’, 1 =[0,00), w(x) =e ‘ul 
are finite. 


5. The violin operator is defined to be the Sturm-Liouville operator

$$Lu = -\frac{\tau}{\rho}\,u''$$

on $0 \le x \le l$ with $u(0) = u(l) = 0$. For this operator $\tau$ denotes the tension in the
violin string, $\rho$ the density of the string, and $l$ the length of the string. Let
$\lambda_1 < \lambda_2 < \cdots$ denote the eigenvalues of L.




(a) Show that $\lambda_n = n^2\alpha$, for n = 1, 2,... and an appropriate choice of $\alpha$.

(b) The first eigenvalue A, is defined to be the pitch of the violin operator. Show 
that one can increase the pitch by either increasing the tension, decreasing 
the length, or decreasing the density. 


6. Other problems can sometimes be reduced to Sturm-Liouville problems, as we
now illustrate. Consider the Brownian motion X(t). That is, X(t) is a real stochastic
process defined for $0 \le t \le 1$ and satisfying:

$$X(0) = 0;$$
$$E(X(t)) = 0, \quad \text{for } 0 \le t \le 1;$$
$$E(|X(t)|^2) = t, \quad \text{for } 0 \le t \le 1;$$
$$k(t,s) = E(X(t)X(s)) = \min(t,s).$$

(a) Show that the eigenvalues of the integral operator K with kernel k(t,s) are
solutions of

$$\lambda\phi(t) = \int_0^t s\,\phi(s)\,ds + t\int_t^1 \phi(s)\,ds,$$

where $\phi(0) = 0$.

(b) Show by taking two derivatives this reduces to the Sturm-Liouville problem

$$\phi'' + \frac{1}{\lambda}\,\phi = 0, \qquad \phi(0) = 0, \quad \phi'(1) = 0.$$

(c) Find the eigenvalues and corresponding eigenvectors.
(d) For an interpretation see the Karhunen-Loeve expansion (Theorem 6.13.1).


6. GARDING’S INEQUALITY 


There is another—almost direct—way of showing that a linear operator L has 
a compact resolvent and that is by showing that L satisfies a Garding inequality. 
In order to explain this we need some additional concepts. 


7.6.1 DEFINITION. Let $X_1$ be a linear subspace in a Banach space $(X_2,\|\cdot\|_2)$
and assume that $\|\cdot\|_1$ is a norm on $X_1$. We shall say that the space $(X_1,\|\cdot\|_1)$ is
compact in $(X_2,\|\cdot\|_2)$ if the unit ball in $X_1$

$$B_1 = \{x \in X_1 : \|x\|_1 \le 1\}$$

lies in a compact subset of $(X_2,\|\cdot\|_2)$. The following characterization of this
concept is a simple consequence of the Compactness Theorem (Theorem 3.17.13).




7.6.2 THEOREM. Let $X_1$ be a linear subspace of a Banach space $(X_2,\|\cdot\|_2)$
and let $\|\cdot\|_1$ be a norm on $X_1$. Then the following statements are equivalent:

(a) $(X_1,\|\cdot\|_1)$ is compact in $(X_2,\|\cdot\|_2)$.

(b) Every subset of $X_1$ that is bounded in the $\|\cdot\|_1$-norm is relatively compact
in the $\|\cdot\|_2$-norm.

(c) Let $\{x_n\}$ be any sequence in $X_1$ with $\|x_n\|_1 \le b < \infty$ for all n. Then there
is a subsequence of $\{x_n\}$ that converges in $(X_2,\|\cdot\|_2)$.

(d) Every subset of $X_1$ that is bounded in the $\|\cdot\|_1$-norm is totally bounded in
the $\|\cdot\|_2$-norm.


EXAMPLE 1.^9 The Sobolev spaces $H_0^n(\Omega)$, for n = 0, 1,..., are defined in
Example 12, Section 5.13. Recall that $H_0^n(\Omega)$ is the completion of $C_0^\infty(\Omega)$ with
respect to the norm $\|\cdot\|_n$, generated by the inner product

$$(u,v)_n = \sum_{|\alpha| \le n} \int_\Omega D^\alpha u\,\overline{D^\alpha v}\,dx.$$

It is easy to see that the spaces are decreasing with n, that is,

$$\cdots \subset H_0^{n+1} \subset H_0^n \subset \cdots \subset H_0^1 \subset H_0^0 = L_2(\Omega).$$

Furthermore one can prove the following result.

7.6.3 THEOREM. (RELLICH'S THEOREM.) Let k and l be nonnegative integers
with l > k, and let $\Omega$ be an open bounded domain in $R^m$. Then $(H_0^l(\Omega),\|\cdot\|_l)$ is
compact in $(H_0^k(\Omega),\|\cdot\|_k)$.


Proof: We shall prove this for the case l = 1 and k = 0. The other cases are
treated in the exercises.

Before proceeding with the proof it will be helpful if the reader reviews the
results of Exercise 29, Section 3.17, and Exercise 16, Section 5.6, as well as Example
4, Section 3.17. Let us make note of some of the properties of the mollifier operator $J_\varepsilon$
defined in Exercise 16, Section 5.6. First recall that


$$J_\varepsilon : L_2(\Omega) \to L_\infty(\Omega)$$


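is a bounded linear operator. This means that there is a constant K such that

$$\sup_{x \in \Omega} |J_\varepsilon u(x)| \le K\Big(\int_\Omega |u(x)|^2\,dx\Big)^{1/2}$$  (7.6.1)

for all $u \in L_2(\Omega)$. Furthermore, if $u \in C_0^\infty(\Omega)$, then

$$D^\alpha(J_\varepsilon u) = (-1)^{|\alpha|} J_\varepsilon(D^\alpha u).$$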


9 This is a rather long and technical example which may be skipped on the first reading. However,
it is important for later examples.




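Hence there is a constant $K_{|\alpha|}$, depending on $|\alpha|$, such that

$$\sup_{x \in \Omega} |D^\alpha(J_\varepsilon u(x))| \le K_{|\alpha|}\Big(\int_\Omega |D^\alpha u(x)|^2\,dx\Big)^{1/2}.$$  (7.6.2)

If $v \in C^1(\Omega)$, then the gradient of v is the vector

$$\nabla v = \Big(\frac{\partial v}{\partial x_1},\ldots,\frac{\partial v}{\partial x_m}\Big)$$

and

$$|\nabla v|^2 = \sum_{|\alpha| = 1} |D^\alpha v|^2.$$

Now by applying (7.6.2) we get

$$\sup_{x \in \Omega} |\nabla J_\varepsilon v(x)|^2 \le \sum_{|\alpha| = 1} \sup_{x \in \Omega} |D^\alpha J_\varepsilon v(x)|^2 \le K_1^2 \sum_{|\alpha| = 1} \int_\Omega |D^\alpha v(x)|^2\,dx,$$

or in terms of the norm $\|\cdot\|_1$ we get

$$\sup_{x \in \Omega} |\nabla J_\varepsilon v(x)| \le K_1 \|v\|_1.$$  (7.6.3)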

Now let A = {u} be a bounded set in $(H_0^1(\Omega),\|\cdot\|_1)$. Since $C_0^\infty(\Omega)$ is dense in
$(H_0^1(\Omega),\|\cdot\|_1)$, for every $\eta > 0$ we can find a set $A_\eta = \{v\}$ in $C_0^\infty(\Omega)$ such that for
each u in A there is a v in $A_\eta$ with $\|u - v\|_1 \le \eta$. It is easy to see that for each
$\eta$, $A_\eta$ is a bounded set in $(H_0^1(\Omega),\|\cdot\|_1)$. If we can show that for every $\eta > 0$, the
set $A_\eta$ is totally bounded in $(H_0^0(\Omega),\|\cdot\|_0)$, it will follow that A is totally bounded
in $(H_0^0(\Omega),\|\cdot\|_0)$, by Exercise 29, Section 3.17.

Now fix $\eta > 0$ and consider $A_\eta$, which can be described as a collection {v} in
$C_0^\infty(\Omega)$ with the property that there is a B with $\|v\|_1 \le B$ for all $v \in A_\eta$. We will
now show that for every $\varepsilon > 0$ the set $J_\varepsilon(A_\eta)$ is totally bounded in $L_2 = H_0^0$. We
do this by showing that the set of functions $J_\varepsilon(A_\eta)$ is pointwise compact and
equicontinuous; see Example 4, Section 3.17.


If $v \in A_\eta$, then it follows from (7.6.1) that

$$|J_\varepsilon v(x)| \le K\|v\|_0 \le K\|v\|_1 \le KB.$$

Thus $J_\varepsilon(A_\eta)$ is pointwise compact. Similarly, by the Mean Value Theorem^10
one has

$$|J_\varepsilon v(x) - J_\varepsilon v(y)| = |\nabla(J_\varepsilon v) \cdot (x - y)|,$$

where $\cdot$ is the usual dot product and the gradient $\nabla(J_\varepsilon v)$ is evaluated at some
point on the line segment connecting x and y. By applying the Schwarz Inequality
to this dot product and by using (7.6.3) we get

$$|J_\varepsilon v(x) - J_\varepsilon v(y)| \le \sup_{x \in \Omega} |\nabla J_\varepsilon v| \cdot |x - y| \le K_1 \|v\|_1\,|x - y|,$$


10 See Taylor [1, p. 224].




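where $|x - y|^2 = \sum_{i=1}^{m} |x_i - y_i|^2$. Since $\|v\|_1 \le B$ we see that $J_\varepsilon(A_\eta)$ is equicontinuous.
Finally we note that (see Exercise 16, Section 5.6)

$$|J_\varepsilon v(x) - v(x)| = \Big|\int_{|\xi| \le 1} \rho(\xi)\big(v(x - \varepsilon\xi) - v(x)\big)\,d\xi\Big| = \Big|\int_{|\xi| \le 1} \rho(\xi)\,\nabla v \cdot \varepsilon\xi\,d\xi\Big|$$
$$\le \varepsilon \int_{|\xi| \le 1} \rho(\xi)\,|\nabla v|\,|\xi|\,d\xi \le \sup_{x \in \Omega} |\nabla v| \cdot \varepsilon \le K\|v\|_1\,\varepsilon \le KB\varepsilon.$$

Thus

$$\Big(\int_\Omega |J_\varepsilon v(x) - v(x)|^2\,dx\Big)^{1/2} \le KB\,|\Omega|^{1/2}\,\varepsilon,$$

where $|\Omega|$ denotes the Lebesgue measure of $\Omega$. The result for the case l = 1 and
k = 0 now follows from Exercise 16, Section 5.6. ∎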


Now let $(X_1,\|\cdot\|_1)$ be compact in $(X_2,\|\cdot\|_2)$ and let T be a linear operator on
$X_2$. Assume that the domain $\mathcal{D}(T)$ lies in $X_1$. We shall say that T satisfies a Garding
Inequality if there is a constant a > 0 such that

$$a\|x\|_1 \le \|Tx\|_2, \qquad x \in \mathcal{D}(T).$$  (7.6.4)

If we consider T as a mapping from $(\mathcal{D}(T),\|\cdot\|_1)$ to $(\mathcal{R}(T),\|\cdot\|_2)$, then Equation
(7.6.4) says that T is bounded below. Therefore the inverse

$$T^{-1} : (\mathcal{R}(T),\|\cdot\|_2) \to (\mathcal{D}(T),\|\cdot\|_1)$$

exists and is bounded, in particular,

$$\|T^{-1}y\|_1 \le a^{-1}\|y\|_2, \qquad y \in \mathcal{R}(T).$$

However, because of the compactness property of $(X_1,\|\cdot\|_1)$ and $(X_2,\|\cdot\|_2)$, $T^{-1}$
is also a compact operator since it maps the unit ball

$$B_2 \cap \mathcal{R}(T) = \{y \in \mathcal{R}(T) : \|y\|_2 \le 1\}$$

into a bounded set in $(X_1,\|\cdot\|_1)$. This in turn implies (Theorem 5.24.2) that if we
consider the norm $\|\cdot\|_2$ on $\mathcal{D}(T)$, then

$$T^{-1} : (\mathcal{R}(T),\|\cdot\|_2) \to (\mathcal{D}(T),\|\cdot\|_2)$$

is compact. Let us summarize this.


7.6.4 THEOREM. Let $(X_1,\|\cdot\|_1)$ be compact in $(X_2,\|\cdot\|_2)$, where $(X_2,\|\cdot\|_2)$ is
a Banach space and let T be a linear operator on $X_2$. Assume that the domain $\mathcal{D}(T)$




lies in $X_1$. If T satisfies a Garding Inequality (7.6.4), then $T^{-1}$ exists and is a compact
operator. Furthermore, there are constants A, B such that

$$\|T^{-1}y\|_1 \le A\|y\|_2, \qquad \|T^{-1}y\|_2 \le B\|y\|_2$$

for all $y \in \mathcal{R}(T)$. If, in addition, $\mathcal{R}(T)$ is dense in $X_2$, then $T^{-1}$ has a unique continuous
extension to all of $X_2$.


The fact that $T^{-1}$ can be extended to all of $X_2$, when $\mathcal{R}(T)$ is dense in $X_2$, is a
standard result for linear operators; see Exercise 3, Section 5.6.

There is a particular form of the Garding Inequality which we shall encounter
in our study of symmetric differential operators. In order to explain this, let L be
a symmetric operator on a Hilbert space H, with inner product $(\cdot,\cdot)$. Let
$H_0$ be a linear subspace of H, and let $(\cdot,\cdot)_0$ and $\|\cdot\|_0$ be another inner product
and associated norm on $H_0$. Assume that the domain $\mathcal{D}(L)$ lies in $H_0$ and that
$(H_0,\|\cdot\|_0)$ is compact in $(H,\|\cdot\|)$.

Now define B[u,v] by

$$(u,Lv) = B[u,v].$$

We note that B[u,v] is a sesquilinear functional defined for $u \in H$, $v \in \mathcal{D}(L)$. Also,
since L is symmetric, then B[u,u] is real-valued. (Why?)
Now assume that there are real constants c and k with c > 0 and such that

$$B[u,u] \ge c\|u\|_0^2 - k\|u\|^2, \qquad u \in \mathcal{D}(L).$$  (7.6.5)

Equation (7.6.5) can be rewritten as

$$(u,Lu + ku) \ge c\|u\|_0^2.$$

By using the Schwarz Inequality for $(\cdot,\cdot)$ we get

$$c\|u\|_0^2 \le \|(L + kI)u\| \cdot \|u\|.$$  (7.6.6)

However, since $(H_0,\|\cdot\|_0)$ is compact in $(H,\|\cdot\|)$ there is (by Exercise 3 in this
section) a constant b > 0 such that

$$\|u\| \le b\|u\|_0.$$

By applying the last inequality to (7.6.6) we get

$$cb^{-1}\|u\|_0 \le \|(L + kI)u\|,$$

that is, (L + kI) satisfies a Garding Inequality. For this reason Equation (7.6.5) is
also referred to as a Garding Inequality.
Let us summarize these observations. 


7.6.5 COROLLARY. Let L be a symmetric operator on a Hilbert space H with
inner product $(\cdot,\cdot)$. Let $(\cdot,\cdot)_0$ be an inner product on $\mathcal{D}(L)$ and assume that
$(\mathcal{D}(L),\|\cdot\|_0)$ is compact in $(H,\|\cdot\|)$. If there are real constants c and k with c > 0
and such that

$$B[u,u] \ge c\|u\|_0^2 - k\|u\|^2,$$  (7.6.7)

then (L + kI) satisfies a Garding Inequality and $(L + kI)^{-1}$ is a compact operator.
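A short worked instance may help fix the roles of the two norms in (7.6.7). The setting below is an assumed illustration, not an example from the text: $H = L_2(0,1)$, $Lu = -u''$ with domain $\mathcal{D}(L) = C_0^\infty(0,1)$, and $\|\cdot\|_0$ taken to be the Sobolev norm written $\|\cdot\|_1$ in Example 1 (so the subscript 0 here follows the notation of the Corollary, not that of the Sobolev scale).

```latex
% Assumed setting: H = L_2(0,1), Lu = -u'', D(L) = C_0^\infty(0,1),
% and \|u\|_0^2 := \int_0^1 ( |u|^2 + |u'|^2 ) dx  (the H_0^1 norm).
\begin{align*}
B[u,u] &= (u,Lu) = -\int_0^1 u\,\overline{u''}\,dx
        = \int_0^1 |u'|^2\,dx
        && \text{(integration by parts; boundary terms vanish)} \\
       &= \int_0^1 \bigl(|u|^2 + |u'|^2\bigr)\,dx - \int_0^1 |u|^2\,dx
        = \|u\|_0^2 - \|u\|^2 .
\end{align*}
% Hence (7.6.7) holds with c = 1 and k = 1. By Rellich's Theorem (7.6.3 with l = 1, k = 0)
% the space (D(L), \|\cdot\|_0) is compact in (H, \|\cdot\|), so Corollary 7.6.5 gives that
% (L + I)^{-1} is a compact operator on the range of L + I.
```

In this instance the inequality is in fact an equality, which shows that the constant k cannot in general be dropped from (7.6.7) without changing c.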




EXERCISES 


1. Prove Theorem 7.6.2. [Hint: Recall the equivalent versions of compactness in
Section 3.17.]


2. Let $X_2 = C[0,1]$ and $X_1 = C^1[0,1]$ be given, where

$$\|x\|_2 = \sup\{|x(t)| : 0 \le t \le 1\},$$

and

$$\|x\|_1 = \|x\|_2 + \|x'\|_2.$$

Show that $(X_1,\|\cdot\|_1)$ is compact in $(X_2,\|\cdot\|_2)$. [Hint: Apply the Arzela-Ascoli
Theorem.]


3. Let $(X_1,\|\cdot\|_1)$ be compact in $(X_2,\|\cdot\|_2)$. Show that there is a b > 0 such that

$$\|x\|_2 \le b\|x\|_1, \qquad x \in X_1.$$

[Hint: Show that the identity mapping $I: X_1 \to X_2$ is compact and therefore
continuous.]


4. (a) Prove Theorem 7.6.3 for the case l = k + 1, $k \ge 1$. [Hint: Use mathematical
induction.]
(b) Prove Theorem 7.6.3 for the general case, l > k.


5. Show that Theorem 7.6.3 can be extended to the Sobolev spaces $H^l(\Omega)$.


6. The Hölder spaces $C^\alpha[0,1]$ with norm $\|\cdot\|_\alpha$ are defined in Example 10, Section 5.3.
Show that if $0 < \alpha < \beta < 1$, then $(C^\beta[0,1],\|\cdot\|_\beta)$ is compact in $(C^\alpha[0,1],\|\cdot\|_\alpha)$.


7. Let $X_1$ be a linear subspace of a Banach space $(X_2,\|\cdot\|_2)$, and consider the
given norm $\|\cdot\|_2$ on $X_1$. Show that $(X_1,\|\cdot\|_2)$ is compact in $(X_2,\|\cdot\|_2)$ if and
only if $X_1$ is finite dimensional.


8. Show that Corollary 7.6.5 can be generalized to a nonsymmetric operator L
when (7.6.7) is replaced by

$$\operatorname{Re} B[u,u] \ge c\|u\|_0^2 - k\|u\|^2.$$


7. ELLIPTIC PARTIAL DIFFERENTIAL OPERATORS


In the next section we shall study the Dirichlet problem, which is a boundary-value
problem for certain elliptic partial differential operators. Our objective in this
section is to show that, under appropriate conditions, these operators satisfy a
Garding Inequality. We shall use the notation of Example 12, Section 5.3.


Consider the differential operator of order 2m,

$$Lu = \sum_{0 \le |\alpha|,|\beta| \le m} (-1)^{|\alpha|} D^\alpha\big(a^{\alpha\beta}(x)D^\beta u\big),$$  (7.7.1)




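where the coefficients $a^{\alpha\beta}$ are assumed to be sufficiently differentiable in a region
$\Omega$ in $R^n$, say that $a^{\alpha\beta}$ belongs to $C^m$. If u and v belong to $C_0^{2m}(\Omega)$, then by integrating
by parts we get

$$(v,Lu) = \int_\Omega v\,\overline{Lu}\,dx = \sum_{0 \le |\alpha|,|\beta| \le m} (-1)^{|\alpha|} \int_\Omega v\,\overline{D^\alpha(a^{\alpha\beta}D^\beta u)}\,dx$$
$$= \sum_{0 \le |\alpha|,|\beta| \le m} (-1)^{|\beta|} \int_\Omega D^\beta\big(\overline{a^{\alpha\beta}}\,D^\alpha v\big)\,\bar u\,dx = (L^*v,u),$$

where^11 (after interchanging the names of the indices $\alpha$ and $\beta$)

$$L^*v = \sum_{0 \le |\alpha|,|\beta| \le m} (-1)^{|\alpha|} D^\alpha\big(\overline{a^{\beta\alpha}}(x)D^\beta v\big).$$

If L = L*, then L is symmetric on the domain $C_0^{2m}(\Omega) \subset L_2(\Omega)$. For example,
if L consists only of higher-order terms

$$Lu = \sum_{|\alpha| = |\beta| = m} (-1)^m D^\alpha(a^{\alpha\beta}D^\beta u),$$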

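then L is symmetric whenever $a^{\alpha\beta} = \overline{a^{\beta\alpha}}$. (Compare this with Example 1, Section
5.23.) In particular, the Laplacian operator

$$\Delta = \sum_{i=1}^{n} \frac{\partial^2}{\partial x_i^2}$$

and the fourth-order biharmonic operator

$$\sum_{i=1}^{n} \frac{\partial^4}{\partial x_i^4}$$

are symmetric operators on $C_0^2(\Omega)$ and $C_0^4(\Omega)$, respectively.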
For any differential operator L given by Equation (7.7.1), we define the
sesquilinear functional B[u,v] by

$$B[u,v] = (u,Lv) = (L^*u,v),$$  (7.7.2)

where $u, v \in C_0^{2m}(\Omega)$. Let us note here that if we integrate by parts B[u,v] takes on
the form

$$B[u,v] = \sum_{0 \le |\alpha|,|\beta| \le m} (D^\alpha u, a^{\alpha\beta}D^\beta v),$$  (7.7.3)

where $(\cdot,\cdot)$ denotes the usual inner product on $L_2(\Omega)$ and $u, v \in C_0^{2m}(\Omega)$. Because
of Exercise 3, Section 5.13, B[u,v] is well defined for all $u, v \in H_0^m(\Omega)$.

Our objective now is to show that B satisfies a Garding Inequality of the form
(7.6.5). In order to simplify our discussion, we shall assume henceforth that the
coefficients $a^{\alpha\beta}$ are constant. A more general discussion, which includes the case of
bounded continuous coefficients, can be found in Friedman [2, pp. 32-37].


11 For now we use L* only as a notational convenience. In Section 10 we shall define the adjoint for
certain unbounded operators. At that time L* will take on an added significance.


Our main hypotheses concern the highest-order coefficients $a^{\alpha\beta}$ with $|\alpha| = |\beta| = m$.


7.7.1 DEFINITION. Let L be a differential operator given by (7.7.1), where the
coefficients $a^{\alpha\beta}$ are constant. We shall say that L is strongly elliptic if there is a
positive constant $c_0$ such that

$$(-1)^m \operatorname{Re}\Big(\sum_{|\alpha| = |\beta| = m} a^{\alpha\beta}\,\xi^\alpha\xi^\beta\Big) \ge c_0|\xi|^{2m}, \qquad \xi \in R^n,$$  (7.7.4)


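where $\xi^\alpha = \xi_1^{\alpha_1} \cdots \xi_n^{\alpha_n}$ and $|\xi|^2 = \xi_1^2 + \cdots + \xi_n^2$. In this notation the expression
$|\xi|^{2m}$ can be written as

$$|\xi|^{2m} = \sum_{|\alpha| = |\beta| = m} \delta^{\alpha\beta}\,\xi^\alpha\xi^\beta,$$  (7.7.5)

where $\delta^{\alpha\beta}$ is the Kronecker delta function.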


The Laplacian operator $\Delta$ furnishes examples of strongly elliptic operators.
One sees easily that $(-1)^m\Delta^m$ is strongly elliptic where m = 1, 2,... and

$$\Delta^m u = \Delta(\Delta^{m-1}u)$$

is defined by induction.
Before stating the main result, let us recall the notation for the norm $\|u\|_m$,
where

$$\|u\|_m = \Big(\sum_{|\alpha| \le m} \int_\Omega |D^\alpha u(x)|^2\,dx\Big)^{1/2}.$$


7.7.2 THEOREM. Let L be given by (7.7.1), where the coefficients $a^{\alpha\beta}$ are constant,
and assume that L is strongly elliptic. Let B be given by (7.7.2). Then there are real
numbers $c_1$ and $k_0$ with $c_1 > 0$ and such that

$$\operatorname{Re} B[u,u] \ge c_1\|u\|_m^2 - k_0\|u\|_0^2$$  (7.7.6)

for all $u \in C_0^\infty(\Omega)$.


Before proving Inequality (7.7.6) let us note that if L is symmetric on $C_0^{2m}(\Omega)$,
then

$$B[u,u] = (u,Lu)$$

is real-valued for all u in $C_0^{2m}(\Omega)$. (Why?) So in this case, Inequality (7.7.6) becomes
the Garding Inequality (7.6.5).

The proof of this theorem relies heavily on the Fourier transform (see Example
13, Section 5.22). Recall that if $U = \mathcal{F}u$ is the Fourier transform of u given by

$$U(\xi) = \int e^{-ix\cdot\xi}\,u(x)\,dx$$

and $V = \mathcal{F}v$, where $u, v \in C_0^\infty(\Omega) \subset C_0^\infty(R^n)$, then

$$\mathcal{F}(D^\alpha u) = (i\xi)^\alpha U(\xi)$$  (7.7.7)




and 
| U@VO dé = ny | u(xo@) ax. (7.7.8) 
We will also need the following two lemmas, which will be proved in the exercises.

7.7.3 LEMMA. Let $\Omega$ be a bounded domain in $R^n$. Then there exists an $\varepsilon_0 > 0$
such that for any $\varepsilon$ with $0 < \varepsilon < \varepsilon_0$ there is a constant $c_\varepsilon$ such that
$$\|u\|_{m-1}^2 \le \varepsilon\|u\|_m^2 + c_\varepsilon\|u\|_0^2 \tag{7.7.9}$$
for all $u \in C_0^m(\Omega)$.


The important thing to note in this lemma is that $\varepsilon_0$ and $c_\varepsilon$ do not depend
on u.


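7.7.4 LEMMA. Let $\Omega$ be a bounded domain in $R^n$ and assume that $\alpha$ and $\beta$
satisfy $|\alpha| \le m$, $|\beta| \le m - 1$. Then there is a constant K such that

$$\Big|\int_\Omega D^\alpha u\,\overline{D^\beta u}\,dx\Big| \le K\|u\|_m\|u\|_{m-1}, \qquad \Big|\int_\Omega D^\beta u\,\overline{D^\alpha u}\,dx\Big| \le K\|u\|_m\|u\|_{m-1}$$
for all u in $C_0^\infty(\Omega)$.

Again it is important to note that the constant K is independent of u.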


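Proof of Theorem 7.7.2: Let $B = \tilde{B} + R$, where B is given by (7.7.3) and

$$\tilde{B}[u,v] = \sum_{|\alpha|=|\beta|=m}(D^\alpha u, a^{\alpha\beta}D^\beta v), \qquad R[u,v] = \sum_{\substack{0\le|\alpha|,|\beta|\le m \\ |\alpha|+|\beta|<2m}}(D^\alpha u, a^{\alpha\beta}D^\beta v).$$

By repeated applications of Lemma 7.7.4 we get

$$|R[u,u]| \le K_1\|u\|_m\|u\|_{m-1},$$
where $K_1$ depends on m and n, but not on u. By using the Fourier transform
together with (7.7.7) and (7.7.8), we get

$$\tilde{B}[u,u] = \sum_{|\alpha|=|\beta|=m}(D^\alpha u, a^{\alpha\beta}D^\beta u) = (2\pi)^{-n}\sum_{|\alpha|=|\beta|=m}\big((i\xi)^\alpha U, a^{\alpha\beta}(i\xi)^\beta U\big) = (2\pi)^{-n}\int |U(\xi)|^2\Big(\sum_{|\alpha|=|\beta|=m} a^{\alpha\beta}\xi^\alpha\xi^\beta\Big)\,d\xi.$$

Thus by (7.7.4) we get
$$\mathrm{Re}\,\tilde{B}[u,u] \ge (2\pi)^{-n}c_0\int |U(\xi)|^2|\xi|^{2m}\,d\xi.$$

Now by applying the inverse Fourier transform together with (7.7.5) we get
$$\mathrm{Re}\,\tilde{B}[u,u] \ge c_0\sum_{|\alpha|=m}(D^\alpha u, D^\alpha u).$$
Combining these two inequalities we get
$$\mathrm{Re}\,B[u,u] \ge \mathrm{Re}\,\tilde{B}[u,u] - |R[u,u]| \ge c_0\sum_{|\alpha|=m}(D^\alpha u, D^\alpha u) - K_1\|u\|_m\|u\|_{m-1}. \tag{7.7.10}$$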

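Since

$$\Big(\sigma\|u\|_m - \frac{K_1}{2\sigma}\|u\|_{m-1}\Big)^2 \ge 0$$

for all $\sigma > 0$, we get

$$K_1\|u\|_m\|u\|_{m-1} \le \sigma^2\|u\|_m^2 + \frac{K_1^2}{4\sigma^2}\|u\|_{m-1}^2,$$

and by applying Lemma 7.7.3 we then get
$$K_1\|u\|_m\|u\|_{m-1} \le \sigma^2\|u\|_m^2 + \frac{K_1^2}{4\sigma^2}\big(\varepsilon\|u\|_m^2 + c_\varepsilon\|u\|_0^2\big).$$

In other words,
$$K_1\|u\|_m\|u\|_{m-1} \le \eta\|u\|_m^2 + K_2\|u\|_0^2, \tag{7.7.11}$$

where $\eta$ can be made arbitrarily small (at the expense of making $K_2$ large).
Similarly one has

$$\|u\|_m^2 = \|u\|_{m-1}^2 + \sum_{|\alpha|=m}(D^\alpha u, D^\alpha u) \le \sum_{|\alpha|=m}(D^\alpha u, D^\alpha u) + \varepsilon\|u\|_m^2 + c_\varepsilon\|u\|_0^2. \tag{7.7.12}$$

Now by applying (7.7.11) with $\eta < c_0/3$ and (7.7.12) with $\varepsilon = 1/3$ to (7.7.10) we get
$$\mathrm{Re}\,B[u,u] \ge \tfrac{1}{3}c_0\|u\|_m^2 - k_0\|u\|_0^2$$

for an appropriate choice of $k_0$. ∎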
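As a concrete instance, the inequality can be checked by hand in the simplest case. The sketch below assumes that L is written in the divergence form $Lu = \sum (-1)^{|\alpha|}D^\alpha(a^{\alpha\beta}D^\beta u)$ and that B is the associated form $B[u,v] = \sum (D^\alpha u, a^{\alpha\beta}D^\beta v)$; it takes m = 1 with $a^{\alpha\beta} = \delta^{\alpha\beta}$ for $|\alpha| = |\beta| = 1$ and all other coefficients zero, so that $L = -\Delta$.

```latex
B[u,u] = \sum_{i=1}^{n}\Big(\frac{\partial u}{\partial x_i},\frac{\partial u}{\partial x_i}\Big)
       = \sum_{|\alpha|\le 1}\int_\Omega |D^\alpha u|^2\,dx - \int_\Omega |u|^2\,dx
       = \|u\|_1^2 - \|u\|_0^2 .
```

Under these assumptions the estimate holds with equality for the constants $c_1 = 1$ and $k_0 = 1$, and the remainder term R vanishes identically.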
Inequality (7.7.6) can actually be extended to hold for all $u \in H_0^m(\Omega)$.


7.7.5 COROLLARY. Under the hypotheses and notation of Theorem 7.7.2, one
has

$$\mathrm{Re}\,B[u,u] \ge c_1\|u\|_m^2 - k_0\|u\|_0^2$$

for all $u \in H_0^m(\Omega)$.


Proof: Let $u \in H_0^m(\Omega)$ and choose a sequence $\{u_j\}$ in $C_0^\infty(\Omega)$ with the
property that

$$\|u - u_j\|_m \to 0 \quad \text{as } j \to \infty.$$
Since
$$\big|\,\|u_j\|_0 - \|u\|_0\big| \le \|u_j - u\|_0 \le \|u_j - u\|_m$$
and
$$\big|\,\|u_j\|_m - \|u\|_m\big| \le \|u_j - u\|_m,$$

we see that $\|u_j\|_m \to \|u\|_m$ and $\|u_j\|_0 \to \|u\|_0$ as $j \to \infty$. Furthermore, if we apply
the Schwarz Inequality to (7.7.3) we get

$$|B[u,v]| \le \sum_{|\alpha|,|\beta|\le m}\|D^\alpha u\|_0\,\|a^{\alpha\beta}D^\beta v\|_0 \le K\|u\|_m\|v\|_m$$
for some constant K. Thus
$$|B[u_j,u_j] - B[u,u]| = |B[u_j,u_j - u] + B[u_j - u,u]| \le K(\|u_j\|_m + \|u\|_m)\|u_j - u\|_m \le K'\|u_j - u\|_m$$

for some constant $K'$, since $\|u_j\|_m$ is bounded. Hence $B[u_j,u_j] \to B[u,u]$ as $j \to \infty$. Since Inequality
(7.7.6) holds for each $u_j$, it also holds in the limit. ∎


EXERCISES 


The first five exercises will lead to a proof of Lemma 7.7.3. 
1. In addition to the hypotheses of Lemma 7.7.3 assume that for i < j one has

$$\sum_{|\alpha|=i}\int_\Omega |D^\alpha u|^2\,dx \le \varepsilon\sum_{|\beta|=j}\int_\Omega |D^\beta u|^2\,dx + \frac{c}{\varepsilon^{i/(j-i)}}\int_\Omega |u|^2\,dx \tag{7.7.13}$$

for all $u \in C_0^\infty(\Omega)$, where c depends on j and $\Omega$. Show that Inequality (7.7.9)
holds.


[Figure 7.7.1: the interval (a, b) with the points $a < x_1 < a + \alpha < a + 3\alpha < x_2 < b$ marked.]


2. In this exercise we shall prove Inequality (7.7.13) for i = 1, j = 2, and n = 1,
that is, $\Omega$ is an interval with length $|\Omega|$. First divide $\Omega$ into intervals each of
length $\le \sqrt{\varepsilon}$ and $> \sqrt{\varepsilon}/2$. Let (a,b) denote a typical subinterval and set $\alpha = (b - a)/4$.
Note that $\alpha^2 \le \varepsilon/16$ and $\alpha^{-2} < 64/\varepsilon$. Let $x_1$ and $x_2$ be arbitrary points
in $(a, a + \alpha)$ and $(a + 3\alpha, b)$, respectively.

(a) Show that for any $x \in (a,b)$ one has
$$|Du(x)| \le \frac{|u(x_1)| + |u(x_2)|}{2\alpha} + \int_a^b |D^2u(\xi)|\,d\xi. \tag{7.7.14}$$

(b) Show that
$$\alpha^2|Du(x)| \le \tfrac{1}{2}\int_a^b |u(\xi)|\,d\xi + \alpha^2\int_a^b |D^2u(\xi)|\,d\xi. \tag{7.7.15}$$
[Hint: Integrate Inequality (7.7.14) with respect to $x_1$ and $x_2$ over the intervals
$(a, a + \alpha)$ and $(a + 3\alpha, b)$.]

(c) Show that
$$\int_a^b |Du|^2\,dx \le \frac{c}{\varepsilon}\int_a^b |u|^2\,dx + c\,\varepsilon\int_a^b |D^2u|^2\,dx. \tag{7.7.16}$$
[Hint: Apply the Schwarz Inequality to (7.7.15) and then integrate with
respect to x.]

(d) Prove (7.7.13) for the case i = 1, j = 2, n = 1.


3. Use Exercise 2 to prove (7.7.13) for the case i = 1, j = 2, $n \ge 2$, where $\Omega$ is a cube
with sides parallel to the coordinate planes.
[Hint: Note that (7.7.16) holds when $Du = \partial u/\partial x_i$ and the integration is with
respect to the $x_i$-variable. Now integrate (7.7.16) with respect to the other
variables.]


4. Use Exercise 3 to prove (7.7.13) for an arbitrary bounded domain $\Omega$.


5. Prove (7.7.13) in general. [Hint: Note that (7.7.13) is trivially true for i = 0.
Now apply mathematical induction on j.]


6. Prove Lemma 7.7.4. [Hint: Apply the Schwarz Inequality.]


8. THE DIRICHLET PROBLEM


The Dirichlet problem is one of the fundamental problems of mathematical 
physics. Before giving a mathematical formulation of this problem it is helpful to 
recall some of the underlying physical problems. 


EXAMPLE 1. (STATIONARY MEMBRANE.) Consider a membrane stretched over
a bounded region $\Omega$ in the plane $R^2$. Let u(x,y) denote the displacement of the membrane
when some external force f(x,y) is acting on the membrane. The relationship
between u and f is then given by

$$\frac{\partial^2 u}{\partial x^2} + \frac{\partial^2 u}{\partial y^2} = f \quad \text{in } \Omega,$$

$$u = 0 \quad \text{on } \partial\Omega,$$

where $\partial\Omega$ denotes the boundary of $\Omega$. ∎


EXAMPLE 2. (ELECTROSTATIC FIELD.) Let f(x,y,z) denote the electric charge
density of a region $\Omega$ in $R^3$. If u(x,y,z) denotes the potential of the electrostatic field
generated by f, then u and f are related by

$$\Delta u = \frac{\partial^2 u}{\partial x^2} + \frac{\partial^2 u}{\partial y^2} + \frac{\partial^2 u}{\partial z^2} = f. \tag{7.8.1}$$

Let us assume that there is a function g, defined on the boundary $\partial\Omega$, and we seek
a solution u with the property that

$$u = g \quad \text{on } \partial\Omega. \tag{7.8.2}$$

This would arise if the boundary were insulated or contained a fixed charge.
The boundary-value problem (7.8.1), (7.8.2) can be reduced to (7.8.1), (7.8.3),
where

$$u = 0 \quad \text{on } \partial\Omega, \tag{7.8.3}$$

provided the boundary $\partial\Omega$ and the function g are sufficiently smooth. This is done
as follows: If $\partial\Omega$ is of class $C^2$ and g is a $C^2$-function on $\partial\Omega$, then there is a $C^2$-function
v defined on $\Omega \cup \partial\Omega$ with the property that v = g on $\partial\Omega$, see Friedman
[2, p. 38]. If we set w = u - v, then the boundary-value problem (7.8.1), (7.8.2) is
equivalent to

$$\Delta w = f - \Delta v \quad \text{in } \Omega,$$
$$w = 0 \quad \text{on } \partial\Omega. \ ∎$$


EXAMPLE 3. (DEFORMED PLATE.) The physical properties of the deformed
plate are similar to those of the stationary membrane. However, in this case the
relationship between the displacement u(x,y) and the external force f(x,y) is given
by the biharmonic equation

$$\frac{\partial^4 u}{\partial x^4} + 2\frac{\partial^4 u}{\partial x^2\,\partial y^2} + \frac{\partial^4 u}{\partial y^4} = f \quad \text{in } \Omega.$$

If the plate is fastened at the boundary, $\partial\Omega$, the boundary conditions take on the
form

$$u = 0, \quad \frac{\partial u}{\partial n} = 0 \quad \text{on } \partial\Omega,$$
where $\partial u/\partial n$ denotes the outward normal derivative of u. ∎


Let us now formulate the Dirichlet problem. We shall give two formulations. 
In the first, we refer to a classical solution whereas in the second we introduce the 
concept of a weak solution. The reader will see that every classical solution of 
the Dirichlet problem is also a weak solution. We shall not attempt to discuss the 
converse question (namely, when is a weak solution also a classical solution) here. 
This would take us too far afield. Instead we refer the reader to excellent treatments 
in Agmon [1], Bers, John, and Schechter [1], Friedman [2], and Garabedian [1]. 


7.8.1 DEFINITION. Let $\Omega$ be a bounded region in $R^n$ and let L be an elliptic
operator of order 2m on $\Omega$ given by
$$Lu = \sum_{0\le|\alpha|,|\beta|\le m}(-1)^{|\alpha|}D^\alpha(a^{\alpha\beta}D^\beta u). \tag{7.8.4}$$
The classical Dirichlet problem is the following: Let f be a continuous function on
$\Omega$. Find a function u in $C^{2m}(\Omega)$ with the property that

$$Lu = f \quad \text{in } \Omega, \tag{7.8.5}$$
$$\frac{\partial^j u}{\partial n^j} = 0 \quad \text{on } \partial\Omega, \quad 0 \le j \le m - 1, \tag{7.8.6}$$

where $\partial^j u/\partial n^j$ denote the outward normal derivatives of u. Such a function u would
then be called a classical solution of the Dirichlet problem.¹²

If u is a classical solution of the Dirichlet problem and $\phi \in C_0^\infty(\Omega)$, then
$(\phi,f) = (\phi,Lu)$. If we integrate the last equality by parts we get

$$(\phi,f) = B[\phi,u],$$

where

$$B[\phi,u] = \sum_{0\le|\alpha|,|\beta|\le m}(D^\alpha\phi, a^{\alpha\beta}D^\beta u).$$
Furthermore, one can show that if the boundary $\partial\Omega$ is sufficiently smooth and if
u is a classical solution of the Dirichlet problem, then $u \in H_0^m(\Omega)$. (See Friedman
[2, pp. 39-40].) We will not do this here, but instead we shall use this fact to
motivate the following definition.


7.8.2 DEFINITION. Let $\Omega$ be a bounded region in $R^n$ and let L be an elliptic
operator of order 2m on $\Omega$ given by (7.8.4). The generalized Dirichlet problem for
L is the following: Let f be a function in $L_2(\Omega)$. Find a function u with the property
that

$$(\phi,f) = B[\phi,u] \quad \text{for all } \phi \in C_0^\infty(\Omega), \tag{7.8.7}$$

and
$$u \in H_0^m(\Omega). \tag{7.8.8}$$

In this case, u is called a (weak) solution of the Dirichlet problem, or sometimes a
solution of the generalized Dirichlet problem.

For any complex number $\lambda$ the generalized Dirichlet problem for $L + \lambda I$ is
defined similarly. In this case the sesquilinear functional $B[\phi,u]$ is replaced by

$$B_\lambda[\phi,u] = (\phi, (L + \lambda I)u).$$

Let us now prove the following existence theorem, which gives a solution of
the generalized Dirichlet problem.


12 The boundary conditions (7.8.6) are called homogeneous boundary conditions. Inhomogeneous
boundary conditions would arise if we set $\partial^j u/\partial n^j = g_j$. However, the inhomogeneous boundary
conditions can be reduced to homogeneous boundary conditions by techniques similar to those
discussed in Example 2. (See Friedman [2, p. 38].)


7.8.3 THEOREM. Let L be given by (7.8.4) where the coefficients $a^{\alpha\beta}$ are constant,
and assume that L is strongly elliptic. Then there is a real number $k_0$ such that for
any $k > k_0$ the generalized Dirichlet problem for L + kI has a unique solution.
Moreover, $(L + kI)^{-1}$ exists and is a compact operator.


Proof: The proof of this result relies on the Lax-Milgram Theorem (Theorem
5.21.2) as well as Corollary 7.7.5.
Let B[u,v] be given by Equation (7.7.3). As noted in the proof of Corollary
7.7.5, one has

$$|B[u,v]| \le K\|u\|_m\|v\|_m$$
for all $u, v \in H_0^m(\Omega)$. Furthermore, if k is any real number and $B_k[u,v] = B[u,v] + (u,kv)$, then
$$|B_k[u,v]| \le |B[u,v]| + |(u,kv)| \le K\|u\|_m\|v\|_m + |k|\,\|u\|_0\|v\|_0 \le (K + |k|)\|u\|_m\|v\|_m.$$
Now let $c_1 > 0$ and $k_0$ be given by Garding's Inequality, that is,
$$\mathrm{Re}\,B[u,u] \ge c_1\|u\|_m^2 - k_0\|u\|_0^2$$
for all $u \in H_0^m(\Omega)$. If $k > k_0$, then
$$|B_k[u,u]| \ge \mathrm{Re}\,B_k[u,u] = \mathrm{Re}\,B[u,u] + (u,ku) \ge c_1\|u\|_m^2 + (k - k_0)\|u\|_0^2 \ge c_1\|u\|_m^2.$$
Hence the sesquilinear functional $B_k[u,v]$ and the Hilbert space $H_0^m(\Omega)$ satisfy the
hypotheses of the Lax-Milgram Theorem (Theorem 5.21.2).
Now let $l: H_0^m(\Omega) \to C$ be given by

$$l(\phi) = (\phi,f),$$
where $f \in L_2(\Omega)$. Since

$$|l(\phi)| \le \|\phi\|_0\|f\|_0 \le \|f\|_0\|\phi\|_m,$$

we see that l is a bounded linear functional. Thus by the Lax-Milgram Theorem
there is a unique $u \in H_0^m(\Omega)$ with

$$l(\phi) = (\phi,f) = B_k[\phi,u]$$

for all $\phi$ in $H_0^m(\Omega)$. This establishes the existence of a weak solution u.

Finally, if L is symmetric the compactness of $(L + kI)^{-1}$ follows from Corollary
7.6.5. The general case (where L is not symmetric) follows from Exercise 8,
Section 6. ∎
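The existence argument can be made concrete in one dimension. The sketch below assumes m = 1, $\Omega = (0,1)$, and L = -d²/dx², so that the weak problem reads $(\phi',u') + k(\phi,u) = (\phi,f)$. Testing against piecewise-linear hat functions with a lumped mass term gives a tridiagonal system, solved here by the Thomas algorithm. The function name `solve_dirichlet`, the grid size, and the test data are illustrative choices, not part of the text.

```python
import math

def solve_dirichlet(f, k, n):
    """Approximate the weak solution of -u'' + k*u = f on (0,1), u(0) = u(1) = 0.

    Uses n interior grid points. The tridiagonal system is the discrete
    counterpart of B_k[phi, u] = (phi, f) tested against hat functions
    (mass term lumped), solved by the Thomas algorithm.
    """
    h = 1.0 / (n + 1)
    diag = [2.0 / h**2 + k] * n
    off = -1.0 / h**2                      # sub- and super-diagonal entry
    rhs = [f((i + 1) * h) for i in range(n)]
    # Forward elimination.
    for i in range(1, n):
        w = off / diag[i - 1]
        diag[i] -= w * off
        rhs[i] -= w * rhs[i - 1]
    # Back substitution.
    u = [0.0] * n
    u[-1] = rhs[-1] / diag[-1]
    for i in range(n - 2, -1, -1):
        u[i] = (rhs[i] - off * u[i + 1]) / diag[i]
    return u

k = 1.0
f = lambda x: (math.pi**2 + k) * math.sin(math.pi * x)   # exact solution: sin(pi*x)
n = 199
u = solve_dirichlet(f, k, n)
err = max(abs(u[i] - math.sin(math.pi * (i + 1) / (n + 1))) for i in range(n))
```

For $k \ge 0$ the matrix is symmetric and positive definite, which is the discrete shadow of the coercivity estimate used in the proof; the computed `err` shrinks as the grid is refined.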


One can say more about the generalized Dirichlet problem if the operator L
contains only highest-order terms,

$$Lu = \sum_{|\alpha|=|\beta|=m}(-1)^m D^\alpha(a^{\alpha\beta}D^\beta u). \tag{7.8.9}$$


7.8.4 COROLLARY. In addition to the hypotheses of Theorem 7.8.3 assume that
L is given by Equation (7.8.9). Then the generalized Dirichlet problem for L has a
unique solution. Moreover $L^{-1}$ exists and is a compact operator.


Proof: If one examines the proof of Theorem 7.7.2 carefully one sees that the
constant $k_0$, which appears in Inequality (7.7.6), can be chosen to be zero when L
is given by (7.8.9). This corollary now follows directly from the last theorem. ∎


It should be noted that the only property of L that was critical in the proof of
the last theorem (and corollary) was that L satisfied the Garding Inequality (7.7.6).
We really did not use the fact that L had constant coefficients, other than to prove
(7.7.6). This suggests that the methods described here may be extended to differential
operators with variable coefficients, and this is indeed the case. We refer the
reader to Agmon [1] and Friedman [2] for these details.

We see, then, that if L is an elliptic partial differential operator that satisfies
the Garding Inequality (7.7.6), then L has a compact resolvent. If L is also symmetric,¹³
then it has a compact, self-adjoint resolvent, and one can use the eigenvalue-eigenfunction
representation of L as described in Section 1. The problem of
finding the eigenvalues and eigenfunctions requires further analysis of the individual
operator. However, there is a technique of finding the eigenvalues and eigenfunctions
which is very useful and deserves special mention. This is the method of
separation¹⁴ of variables, which we describe in the following example.


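EXAMPLE 4. Let $\Omega$ be the unit square
$$\{(x,y): 0 \le x \le 1,\ 0 \le y \le 1\}$$

in $R^2$ and consider the Laplacian operator $\Delta$ on $\Omega$, with boundary conditions
u = 0 on $\partial\Omega$. We seek solutions of the eigenvalue-eigenvector problem:
$$\Delta u = \lambda u \ \text{ on } \Omega, \qquad u = 0 \ \text{ on } \partial\Omega. \tag{7.8.10}$$

Let us assume now that the solution of (7.8.10) is of the form

$$u(x,y) = X(x)Y(y).$$
Equation (7.8.10) then becomes

$$Y\frac{d^2X}{dx^2} + X\frac{d^2Y}{dy^2} = \lambda XY,$$

or by dividing by XY we get
$$\frac{1}{X}\frac{d^2X}{dx^2} = \lambda - \frac{1}{Y}\frac{d^2Y}{dy^2}. \tag{7.8.11}$$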

13 If L is not symmetric, then one can use the theory of nonnormal compact operators presented in
Section 6.14.
14 Let not the ox and ass plow together. (Origin unknown.)


Since the left side of (7.8.11) depends on x and the right side depends on y we see
that each side must be constant, that is,

$$\frac{1}{X}\frac{d^2X}{dx^2} = \mu = \lambda - \frac{1}{Y}\frac{d^2Y}{dy^2},$$
or
$$\frac{d^2X}{dx^2} = \mu X, \qquad \frac{d^2Y}{dy^2} = (\lambda - \mu)Y. \tag{7.8.12}$$

The boundary conditions u = 0 on $\partial\Omega$ now become
$$X(0) = X(1) = 0, \qquad Y(0) = Y(1) = 0. \tag{7.8.13}$$

In other words, we have reduced the planar problem (7.8.10) to solving two
Sturm-Liouville problems (7.8.12)-(7.8.13). Since X and Y must vanish at 0 and at 1,
we see that $\mu = -n^2\pi^2$ for n = 1, 2, ... and $(\lambda - \mu) = -m^2\pi^2$ for m = 1, 2, .... The solution of
(7.8.12)-(7.8.13) is

$$X(x) = \sin n\pi x, \quad \mu = -n^2\pi^2, \quad n = 1, 2, \ldots,$$

$$Y(y) = \sin m\pi y, \quad \lambda - \mu = -m^2\pi^2, \quad m = 1, 2, \ldots.$$

Consequently the solution of (7.8.10) is then

$$u(x,y) = \sin n\pi x\,\sin m\pi y, \qquad \lambda = -(n^2 + m^2)\pi^2,$$

n = 1, 2, ..., m = 1, 2, .... ∎
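The role of the boundary conditions in this example can be checked numerically. The sketch below compares sine and cosine products on the unit square: both satisfy the differential equation $\Delta u = \lambda u$ with $\lambda = -(n^2 + m^2)\pi^2$, but only the sine product vanishes on $\partial\Omega$ (the cosine product instead has vanishing normal derivative there). The helper names, the finite-difference step, and the sample points are illustrative assumptions.

```python
import math

def laplacian_fd(u, x, y, h=1e-3):
    """Five-point finite-difference approximation of the Laplacian of u at (x, y)."""
    return (u(x + h, y) + u(x - h, y) + u(x, y + h) + u(x, y - h) - 4.0 * u(x, y)) / h**2

def candidate(n, m, kind):
    """Product candidates on the unit square: sine or cosine products."""
    g = math.sin if kind == "sin" else math.cos
    return lambda x, y: g(n * math.pi * x) * g(m * math.pi * y)

n, m = 2, 3
lam = -(n**2 + m**2) * math.pi**2
x0, y0 = 0.3, 0.7                      # arbitrary interior point

u_sin = candidate(n, m, "sin")
u_cos = candidate(n, m, "cos")

# Both products satisfy the differential equation up to discretization error ...
res_sin = abs(laplacian_fd(u_sin, x0, y0) - lam * u_sin(x0, y0))
res_cos = abs(laplacian_fd(u_cos, x0, y0) - lam * u_cos(x0, y0))

# ... but only the sine product vanishes on the boundary of the square.
boundary = [(0.0, 0.4), (1.0, 0.4), (0.4, 0.0), (0.4, 1.0)]
max_sin_bdry = max(abs(u_sin(x, y)) for x, y in boundary)
max_cos_bdry = max(abs(u_cos(x, y)) for x, y in boundary)
```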


EXERCISES 


1. Show that the eigenvalues for the problem

$$\Delta u = \lambda u \ \text{ on } \Omega, \qquad u = 0 \ \text{ on } \partial\Omega,$$
where

$$\Omega = \{(x_1,x_2,x_3): 0 \le x_i \le 1,\ i = 1, 2, 3\},$$
are of the form
$$\lambda = -(l^2 + m^2 + n^2)\pi^2,$$
where l = 1, 2, ..., m = 1, 2, ..., n = 1, 2, .... Find the corresponding eigenfunctions.
[Hint: Let $u(x_1,x_2,x_3) = V_1(x_1)W(x_2,x_3)$ and reduce this to Example 4.]
2. Find the eigenvalues and eigenfunctions for the Laplacian operator $\Delta u$ on

$$\Omega = \{(x_1,\ldots,x_n): 0 \le x_i \le 1,\ 1 \le i \le n\},$$
with u = 0 on $\partial\Omega$.


3. Let $\Omega$ be the unit square as given in Example 4. In each of the following exercises
you are given a function f(x,y) and are asked to express the solution u(x,y) of

$$\frac{\partial^2 u}{\partial x^2} + \frac{\partial^2 u}{\partial y^2} = f(x,y) \ \text{ on } \Omega, \qquad u = 0 \ \text{ on } \partial\Omega,$$

as a Fourier expansion in terms of the eigenfunctions of $\Delta$.
(a) $f(x,y) = x$.

(b) $f(x,y) = \sin \pi(x + y)$.

(c) $f(x,y) = x(1 - y)$.

(d) $f(x,y) = x + y$. [Hint: Use (a).]


4. Find the eigenvalues and eigenfunctions for the Laplacian operator $\Delta u$ on

$$\Omega = \{(x,y): x^2 + y^2 \le a^2\}$$

with u = 0 on $\partial\Omega$. [Hint: Use polar coordinates and separate variables.]


5. Find the eigenvalues and eigenfunctions for the Laplacian operator $\Delta u$ on the
sphere
$$\Omega = \{(x_1,x_2,x_3): x_1^2 + x_2^2 + x_3^2 \le r^2\}$$

with u = 0 on $\partial\Omega$. [Hint: Use spherical coordinates and separate variables.]


6. Find the eigenvalues and eigenfunctions for the Laplacian operator $\Delta u$ on the
cylinder
$$\Omega = \{(x_1,x_2,x_3): x_1^2 + x_2^2 \le a^2,\ 0 \le x_3 \le h\}$$

with u = 0 on $\partial\Omega$. [Hint: Use cylindrical coordinates and separate variables.]


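7. Find the eigenvalues and eigenfunctions for the operator

$$Lu = \frac{\partial^2 u}{\partial x^2} + \frac{\partial^2 u}{\partial y^2} + \frac{\partial u}{\partial x}$$

on
$$\Omega = \{(x,y): 0 \le x \le 1,\ 0 \le y \le 1\}$$

with u = 0 on $\partial\Omega$. Is the operator L symmetric?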


8. Consider the biharmonic operator

$$\Delta^2 u = \frac{\partial^4 u}{\partial x^4} + 2\frac{\partial^4 u}{\partial x^2\,\partial y^2} + \frac{\partial^4 u}{\partial y^4}$$

on the disk $\Omega = \{(x,y): x^2 + y^2 \le a^2\}$.

(a) Show that if $\Delta^2 u = \lambda u$, where u = 0 on $\partial\Omega$ and $u \ne 0$, then $\lambda > 0$. [Hint:
Use the Garding Inequality.]

(b) Find the eigenvalues and eigenfunctions of $\Delta^2 u$ on $\Omega$ with u = 0 on $\partial\Omega$.
[Hint: Let $\lambda = k^4$ and write $\Delta^2 u - \lambda u = 0$ as $(\Delta - k^2)(\Delta + k^2)u = 0$. Now
use polar coordinates to separate variables.]


9. THE HEAT EQUATION AND WAVE EQUATION 


Let L be an elliptic differential operator of order 2m defined on some region
$\Omega \subset R^n$. In this section we shall discuss two partial differential equations

$$\frac{\partial u}{\partial t} = Lu$$
and
$$\frac{\partial^2 u}{\partial t^2} = Lu.$$

The first equation is called the heat equation and the second is called the wave
equation. The problem is to find a solution u(x,t) with $x \in \Omega$, $t \ge 0$ subject to
appropriate boundary conditions, which we shall formulate momentarily. However,
before doing this, there is another assumption which warrants some discussion.

We shall assume later that the boundary conditions that determine L are
homogeneous and that L is a symmetric operator with a compact self-adjoint
resolvent. The conditions under which this assumption is satisfied are discussed
in the last two sections. For such an operator, recall that there exists a set of eigenvalues
$\{\mu_1,\mu_2,\ldots\}$ and corresponding eigenvectors (or eigenfunctions) $\{\phi_1,\phi_2,\ldots\}$
such that the eigenvectors form an orthonormal basis for $L_2(\Omega)$. Furthermore, if
$u = \sum_n (u,\phi_n)\phi_n$ lies in the domain of L, then

$$Lu = \sum_n \mu_n(u,\phi_n)\phi_n.$$


Heat Equation 


Let u(x,t) denote the temperature at a point $x \in \Omega \subset R^n$ at some time t. Let
us assume that the initial temperature distribution

$$u(x,0) = f(x), \qquad x \in \Omega \tag{7.9.1}$$

is known and that the temperature is normalized so that on the boundary $\partial\Omega$ one
has

$$u(x,t) = 0, \qquad x \in \partial\Omega,\ t \ge 0. \tag{7.9.2}$$
We now seek to solve the heat equation
$$\frac{\partial u}{\partial t} = Lu \tag{7.9.3}$$

subject to the initial condition (7.9.1) and the boundary condition (7.9.2).
Let us assume that

$$u(x,t) = U(x)V(t).$$
Then Equation (7.9.3) becomes

$$\frac{1}{V}\frac{dV}{dt} = \frac{1}{U}LU. \tag{7.9.4}$$


Since the left side of Equation (7.9.4) depends only on t and the right side depends
only on x, we see that they must be constant. That is, Equation (7.9.4) becomes

$$LU = \lambda U, \tag{7.9.5}$$
$$\frac{dV}{dt} = \lambda V. \tag{7.9.6}$$
The boundary conditions (7.9.2) can now be written as
$$U = 0 \ \text{ on } \partial\Omega. \tag{7.9.7}$$

Now assume that the operator L with the homogeneous boundary conditions
(7.9.7) is a symmetric operator with a compact self-adjoint resolvent and let
$\{U_1,U_2,\ldots\}$ be an orthonormal basis of eigenvectors with corresponding eigenvalues
$\{\mu_1,\mu_2,\ldots\}$. The eigenvalues are, of course, real, and

$$V(t) = e^{\mu_n t}$$
is the solution of (7.9.6) with $\lambda = \mu_n$. Hence u(x,t) takes on the form
$$u(x,t) = e^{\mu_n t}U_n(x), \qquad n = 1, 2, \ldots.$$
The general solution of (7.9.2), (7.9.3) would then be

$$u(x,t) = \sum_n d_n e^{\mu_n t}U_n(x),$$

where the coefficients $d_n$ are to be determined by (7.9.1). That is,

$$f(x) = u(x,0) = \sum_n d_n U_n(x).$$

If we assume that $f \in L_2(\Omega)$ then $d_n = (f,U_n)$ by the Fourier Series Theorem. Hence
the general solution of (7.9.1), (7.9.2), (7.9.3) is

$$u(x,t) = \sum_n (f,U_n)e^{\mu_n t}U_n(x). \ ∎$$
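The eigenfunction expansion above can be sketched numerically in the simplest setting. The code below assumes $\Omega = (0,1)$ and L = d²/dx² with U(0) = U(1) = 0, for which the orthonormal eigenfunctions are $\sqrt{2}\sin n\pi x$ with eigenvalues $-(n\pi)^2$. The function names, the quadrature rule, and the truncation level are illustrative choices.

```python
import math

def inner(f, g, n_quad=2000):
    """Approximate the L2(0,1) inner product (f, g) of real functions by the midpoint rule."""
    h = 1.0 / n_quad
    return h * sum(f((j + 0.5) * h) * g((j + 0.5) * h) for j in range(n_quad))

def U(n):
    """Orthonormal eigenfunctions of d^2/dx^2 on (0,1) with U(0) = U(1) = 0."""
    return lambda x: math.sqrt(2.0) * math.sin(n * math.pi * x)

def mu(n):
    """Corresponding eigenvalues mu_n = -(n*pi)^2."""
    return -(n * math.pi) ** 2

def heat_solution(f, n_terms=30):
    """Return u(x, t) = sum_n (f, U_n) exp(mu_n t) U_n(x), truncated to n_terms."""
    d = [inner(f, U(n)) for n in range(1, n_terms + 1)]
    def u(x, t):
        return sum(d[n - 1] * math.exp(mu(n) * t) * U(n)(x)
                   for n in range(1, n_terms + 1))
    return u

f = lambda x: math.sin(math.pi * x)      # a single mode, so the exact solution is known
u = heat_solution(f)
value = u(0.5, 0.1)                       # exact value: exp(-pi^2 * 0.1)
```

Because all eigenvalues here are negative, every mode decays in time, and the higher modes decay fastest; truncating the series is therefore harmless for t > 0.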


Wave Equation 


Let u(x,t) denote the displacement coordinate of some wave phenomenon at
a point $x \in \Omega \subset R^n$ at some time t. For example, this may be a vibrating string,
vibrating membrane, a resonating cavity or one of a multitude of other phenomena.
Let us assume that the initial distribution of u and $u_t$ is given by

$$u(x,0) = f(x), \qquad x \in \Omega \tag{7.9.8}$$
and

$$\frac{\partial u}{\partial t}(x,0) = g(x), \qquad x \in \Omega. \tag{7.9.9}$$

Also assume that on the boundary $\partial\Omega$ one has

$$u(x,t) = 0, \qquad x \in \partial\Omega,\ t \ge 0. \tag{7.9.10}$$


We now seek to solve the wave equation

$$\frac{\partial^2 u}{\partial t^2} = Lu \tag{7.9.11}$$

subject to the initial conditions (7.9.8), (7.9.9) and the boundary condition (7.9.10).
To do this we can use the method of separation of variables as was done for
the heat equation. The analysis will differ in that the equation for V(t) becomes


d*V 
dt? 


Now assume that the operator L satisfies the conditions for the heat equation
analysis, and in addition, that the eigenvalues $\{\mu_1,\mu_2,\ldots\}$ of L are positive. Thus
if $\lambda = \mu_n$, V(t) becomes

$$V(t) = a_n e^{i\omega_n t} + b_n e^{-i\omega_n t},$$

where $\omega_n = \sqrt{\mu_n}$ and $a_n$ and $b_n$ are arbitrary. The general solution of (7.9.10),
(7.9.11) then becomes

$$u(x,t) = \sum_{n=1}^{\infty}\big(a_n e^{i\omega_n t} + b_n e^{-i\omega_n t}\big)U_n(x), \tag{7.9.12}$$
where $a_n$ and $b_n$ are to be determined by (7.9.8) and (7.9.9). That is,

$$u(x,0) = f(x) = \sum_n (a_n + b_n)U_n(x),$$

$$u_t(x,0) = g(x) = \sum_n (i\omega_n a_n - i\omega_n b_n)U_n(x).$$

If f and g belong to $L_2(\Omega)$, then the Fourier Series Theorem implies that
$$a_n + b_n = (f,U_n),$$
$$i\omega_n a_n - i\omega_n b_n = (g,U_n),$$
or equivalently,
$$a_n = \tfrac{1}{2}\Big(f - \frac{i}{\omega_n}g,\ U_n\Big), \qquad b_n = \tfrac{1}{2}\Big(f + \frac{i}{\omega_n}g,\ U_n\Big). \tag{7.9.13}$$

If we incorporate (7.9.13) into (7.9.12) we get the general solution of (7.9.8)-
(7.9.11). ∎
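The step from the two linear equations for $a_n$ and $b_n$ to the closed form (7.9.13) is a 2-by-2 solve, and can be checked mode by mode. In the sketch below `f_n` and `g_n` stand for the Fourier coefficients $(f,U_n)$ and $(g,U_n)$; their numerical values and the frequency are hypothetical sample data, and the function names are illustrative.

```python
import cmath

def wave_coefficients(f_n, g_n, omega_n):
    """Solve a + b = f_n and i*omega*(a - b) = g_n for the mode amplitudes a, b."""
    a = 0.5 * (f_n - 1j * g_n / omega_n)
    b = 0.5 * (f_n + 1j * g_n / omega_n)
    return a, b

def mode(a, b, omega_n, t):
    """Time factor a*exp(i*omega*t) + b*exp(-i*omega*t) of a single mode."""
    return a * cmath.exp(1j * omega_n * t) + b * cmath.exp(-1j * omega_n * t)

f_n, g_n, omega_n = 0.8, -1.5, 3.0      # hypothetical coefficients and frequency
a, b = wave_coefficients(f_n, g_n, omega_n)

sum_check = a + b                        # should reproduce f_n
rate_check = 1j * omega_n * (a - b)      # should reproduce g_n
```

For real data the two amplitudes are complex conjugates of each other, so each mode reduces to the real expression $f_n\cos\omega_n t + (g_n/\omega_n)\sin\omega_n t$.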
EXERCISES 


1. Analyze the solution of the wave equation in the case where some of the eigenvalues
of L are negative.


2. Solve the heat equation
$$\frac{\partial u}{\partial t} = \frac{\partial^2 u}{\partial x^2}, \qquad 0 \le x \le l,\ 0 \le t,$$

$$u(x,0) = f(x), \qquad u(0,t) = u(l,t) = 0,$$

on $L_2[0,l]$. [Hint: Expand f(x) in terms of a suitable orthonormal basis in
$L_2[0,l]$.]


3. Solve the heat equation 


Ou 6*u 
— 1)* —~ 
Ot oad) 0x?’ 


u(x,0) = f(x), u(0,t) = u(1,t) = 0 


0<x<l, 0<t, 


on L,[0,1]. 
4. Solve the wave equation 
ot? Ax?” 
u(x,0)=f(x), —-u,(x,0) = g(x), 
u(0,t) = u(1,t) = 0 


0<x<Il, 0<t, 


on L,[0,1]. 
5. Solve the wave equation 
@? @? 
Saas + IPS, O<x<1, O0<t, 


u(x,0) = f(x), u,x,0) = 0, 
u(0,t) = u(1,t) = 0, 
on L,[0,1]. 


6. Solve the heat equation

$$\frac{\partial u}{\partial t} = \Delta u, \qquad x = (x_1,x_2,x_3) \in \Omega,\ 0 \le t,$$
$$u(x,0) = |x| = (x_1^2 + x_2^2 + x_3^2)^{1/2},$$
$$u(0,t) = 0,$$
$$u(x,t) = 0 \quad \text{for } x \in \partial\Omega,$$

where $\Omega$ is the sphere of radius 1 centered at the origin and $\Delta$ is the Laplacian
operator.


7. Solve the wave equation

$$\frac{\partial^2 u}{\partial t^2} = k^2\Delta u, \qquad x = (x_1,x_2,x_3) \in \Omega,\ 0 \le t,$$
$$u(x,0) = f(x), \qquad u_t(x,0) = g(x),$$
$$u(0,t) = 0,$$
$$u(x,t) = 0 \quad \text{for } x \in \partial\Omega,$$

where $\Omega$ is as given in Exercise 6.


10. SELF-ADJOINT OPERATORS 


In many applications, especially in quantum mechanics, one is interested in
studying a given linear operator L by means of its spectrum. More specifically, if L
is a symmetric operator one is interested in knowing whether L can be written as a
weighted sum of projections.¹⁵ The answer, as we shall see in Example 1, requires
that we look carefully at the domain $\mathcal{D}_L$ of the operator L. We begin by first
defining the concept of the adjoint of an unbounded linear operator.

Let L be a linear operator on a Hilbert space H. Recall that this means that L
is linear and its domain $\mathcal{D}_L$ is dense in H. The adjoint operator L* will be defined
so that the equation

$$(Lx,y) = (x,L^*y) \tag{7.10.1}$$
is "valid." However, we have to be more precise. First let us define $\mathcal{D}^*$ as follows:
$$\mathcal{D}^* = \{y \in H: (Lx,y) = (x,z) \text{ for some } z \in H \text{ and all } x \in \mathcal{D}_L\}. \tag{7.10.2}$$

Let $y \in \mathcal{D}^*$ and assume that w and z satisfy

$$(Lx,y) = (x,w) = (x,z)$$

for all x in $\mathcal{D}_L$. One then has (x, w - z) = 0 for all x in $\mathcal{D}_L$, or $w - z \perp \mathcal{D}_L$, which
implies that $w - z \perp \overline{\mathcal{D}}_L = H$. Hence w - z = 0, or w = z. This means that the
mapping $y \to z$ given in the definition of $\mathcal{D}^*$ maps y onto precisely one point z.
We call this mapping L* and let $\mathcal{D}^* = \mathcal{D}_{L^*}$ be the domain of L*. Furthermore, it is
easy to show that L* is linear and we ask the reader to do this.


7.10.1 THEOREM. Let L be a densely defined operator on a Hilbert space H and
let L* and $\mathcal{D}^*$ be given by (7.10.1) and (7.10.2). Then L* is a linear operator on $\mathcal{D}^*$.


15 We are purposely making a simplification here. Many of the operators of mathematical physics
have a nontrivial continuous spectrum and therefore these operators require a Spectral Theorem
which uses weighted "integrals" of projections instead of weighted sums of projections. In any
case, the concept of self-adjointness which we introduce in this section is the same regardless of the
form of the Spectral Theorem.


EXAMPLE 1. Let L be a weighted sum of projections on a separable Hilbert
space H. That is,

$$Lx = \sum_{n=1}^{\infty}\lambda_n P_n x,$$
where the domain of L is given by

$$\mathcal{D}(L) = \Big\{x \in H: \lim_{N\to\infty}\sum_{n=1}^{N}\lambda_n P_n x \text{ exists}\Big\}.$$

The object of this example is to show that the adjoint L* is given by
$$L^*x = \sum_{n=1}^{\infty}\bar{\lambda}_n P_n x \tag{7.10.3}$$
and (more importantly) that the domain $\mathcal{D}(L^*)$ of L* satisfies

$$\mathcal{D}(L^*) = \mathcal{D}(L).$$

Since $\{P_n\}$ is a resolution of the identity, it follows that the closed linear spaces
$\mathcal{R}(P_n)$ are mutually orthogonal and that

$$H = \mathcal{R}(P_1) + \mathcal{R}(P_2) + \cdots.$$

Without any loss in generality we can assume that $\mathcal{R}(P_n)$ is one-dimensional. Therefore,
if $e_n$ is a unit vector in $\mathcal{R}(P_n)$, then $\{e_n\}$ is an orthonormal basis for H. The
Fourier Series Theorem then assures us that any vector $x \in H$ can be expressed
uniquely as

$$x = \sum_n (x,e_n)e_n$$
and that
$$(x,y) = \sum_n (x,e_n)\overline{(y,e_n)}.$$

Furthermore, we note that $x \in \mathcal{D}(L)$ if and only if

$$\sum_n |\lambda_n(x,e_n)|^2 < \infty.$$
Since $P_n x = (x,e_n)e_n$ we see that

$$Lx = \sum_n \lambda_n(x,e_n)e_n = \sum_n \lambda_n P_n x.$$
Thus if $x \in \mathcal{D}(L)$ and $y \in H$, then
$$(Lx,y) = \sum_n \lambda_n(x,e_n)\overline{(y,e_n)}.$$

Furthermore, if there is a $z \in H$ such that

$$(Lx,y) = (x,z) = \sum_n (x,e_n)\overline{(z,e_n)}$$

for all $x \in \mathcal{D}(L)$, then
$$(z,e_n) = \bar{\lambda}_n(y,e_n).$$
That is, if z = L*y, then

$$L^*y = \sum_n \bar{\lambda}_n(y,e_n)e_n = \sum_n \bar{\lambda}_n P_n y.$$

We see, then, that L* does satisfy (7.10.3).

Let us now show that $\mathcal{D}(L) = \mathcal{D}(L^*)$. First we note that the above argument
shows that if $y \in \mathcal{D}(L)$, then $y \in \mathcal{D}(L^*)$. Now assume that $y \in \mathcal{D}(L^*)$. Then there is a
$z \in H$ such that for all x in $\mathcal{D}(L)$ one has

$$\sum_{n=1}^{\infty}\lambda_n(x,e_n)\overline{(y,e_n)} = \sum_{n=1}^{\infty}(x,e_n)\overline{(z,e_n)}.$$

Since $\mathcal{D}(L)$ is dense in H, this implies that $\bar{\lambda}_n(y,e_n) = (z,e_n)$ for all n, so

$$z = \sum_n \bar{\lambda}_n(y,e_n)e_n = L^*y.$$

Now by the Parseval Equality we get

$$\sum_n |\lambda_n(y,e_n)|^2 < \infty.$$
Hence $y \in \mathcal{D}(L)$. ∎
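The domain condition in this example is easy to probe numerically. The sketch below assumes the illustrative weights $\lambda_n = n$ and two square-summable coefficient sequences, one of which fails the condition $\sum_n |\lambda_n(x,e_n)|^2 < \infty$; the function name and the sequences are assumptions made for the illustration.

```python
def partial_norm_sq(lam, coeff, N):
    """Partial sum of |lam(n) * coeff(n)|^2 for n = 1..N, the series deciding x in D(L)."""
    return sum(abs(lam(n) * coeff(n)) ** 2 for n in range(1, N + 1))

lam = lambda n: float(n)          # unbounded weights lambda_n = n (an illustrative choice)
x = lambda n: 1.0 / n             # (x, e_n) = 1/n: square-summable, so x is in H
y = lambda n: 1.0 / n**2          # (y, e_n) = 1/n^2

for N in (10, 100, 1000):
    print(N, partial_norm_sq(lam, x, N), partial_norm_sq(lam, y, N))
```

The partial sums for x grow like N, so x lies in H but not in the domain of L, while those for y stay bounded, so y lies in the domain. This is the sense in which the domain of an unbounded weighted sum of projections is a proper dense subspace of H.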


We would like to show that $\mathcal{D}^*$ is in general dense in H, that is, L* is densely
defined; however, this requires the concept of a "closed" operator. This concept
has been discussed in Exercises 9-14 of Section 5.6 and Exercises 13-14 of Section
5.8. We shall give the definition again; however, it would be helpful if the reader
would quickly review these earlier exercises.


7.10.2 DEFINITION. Let L be a densely defined operator on a Hilbert space
H, with the domain $\mathcal{D}_L$. We shall say that L is a closed operator if for every sequence
$\{x_n\}$ in $\mathcal{D}_L$, with the property that both limits $x = \lim x_n$ and $y = \lim Lx_n$ exist, one
has $x \in \mathcal{D}_L$ and y = Lx.


It is important to note that the concept of a closed operator does depend on
the domain of the operator. It may happen that an operator L on $\mathcal{D}_L$ is not closed,
but that there is an extension of L to $\bar{L}$ on $\mathcal{D}_{\bar{L}}$, where $\bar{L}$ is closed. In this case we shall
say that the operator L is closable. The concept of closable operators is discussed
further in the exercises. The reader may be assured that just about all linear
operators of interest (bounded operators, differential operators, multiplicative
operators, integral operators) are either closed or closable.


7.10.3 THEOREM. Let L be a densely defined linear operator on a Hilbert
space H with domain $\mathcal{D}_L$ and let L* denote the adjoint with domain $\mathcal{D}^*$. If L is a
closable operator, then L* is a closed densely defined operator on H.


Proof: We will prove that L* is a closed operator. The fact that $\mathcal{D}^*$ is dense
in H will be shown in Exercise 3, in this section.
In order to show that L* is closed we let $\{y_n\}$ be a sequence in $\mathcal{D}^*$ with the
property that both limits $y = \lim y_n$ and $z = \lim L^*y_n$ exist. We then must show that
$y \in \mathcal{D}^*$ and z = L*y. However, the continuity of the inner product gives us

$$(Lx,y) = \lim(Lx,y_n) = \lim(x,L^*y_n) = (x,z)$$
for all $x \in \mathcal{D}_L$. Hence, $y \in \mathcal{D}^*$ and z = L*y. ∎


Since the adjoint operator L* is closed and densely defined one can define the
second adjoint L** by the equation

$$(L^*y,z) = (y,L^{**}z).$$

By combining this with (7.10.1) one suspects that L = L**. If L is closed, then,
as we shall see in the exercises, one does have L = L**. The proof of the last
equation is not a complete triviality for unbounded operators since one must prove
two things: (1) the domains $\mathcal{D}_L$ and $\mathcal{D}_{L^{**}}$ are the same; (2) the operators L and
L** agree.

We are now prepared to define self-adjointness. However, before doing this 
let us observe that the following theorem is a characterization of symmetric 
operators. 


7.10.4 THEOREM. A linear operator L: $\mathcal{D}_L \to H$ is symmetric if and only if
$\mathcal{D}_L \subseteq \mathcal{D}_{L^*}$ and Lx = L*x for all $x \in \mathcal{D}_L$.


7.10.5 DEFINITION. A densely defined operator L: $\mathcal{D}_L \to H$ is said to be
self-adjoint if $\mathcal{D}_L = \mathcal{D}_{L^*}$ and Lx = L*x for all $x \in \mathcal{D}_L$.


It follows from Example 1 that a weighted sum of projections is symmetric
if and only if it is self-adjoint and this occurs if and only if the $\lambda_n$'s are real.

We will conclude this section with a few examples. However, before turning to
these let us observe that it is possible to define the concept of "normality" for
unbounded operators. We will not do that here but instead we refer the reader to
Stone [1, p. 311 ff].


EXAMPLE 2. (THE POSITION OPERATOR Q.) This operator is defined in Example
1, Section 4 and was shown to be symmetric. In order to show that it is self-adjoint
we let $v \in \mathcal{D}_{Q^*}$ and set v* = Q*v. We wish to show that

$$v^*(x) = xv(x). \tag{7.10.4}$$

Since (Qu,v) = (u,v*) for all u in $\mathcal{D}_Q$, we have

$$\int_{-\infty}^{\infty} u(x)\,\overline{[xv(x) - v^*(x)]}\,dx = 0, \qquad u \in \mathcal{D}_Q.$$

That is, $xv(x) - v^*(x) \perp u(x)$ for all $u \in \mathcal{D}_Q$. Therefore, since $\mathcal{D}_Q$ is dense, one has
xv(x) - v*(x) = 0, which is (7.10.4). ∎


EXAMPLE 3. The Position Operator Q, defined in Example 2, Section 4 is 
self-adjoint. This can be proved by repeating the reasoning of the last example. J 


EXAMPLE 4. The Potential Operator V(Q) defined in Example 3, Section 
4 is self-adjoint. J 


EXAMPLE 5. The Momentum Operator P defined in Example 4, Section 4 is
symmetric but not self-adjoint. The reason for this is that the domain $C_0^1(-\infty,\infty)$
is not large enough. However, if we enlarge the domain, which means that we
extend P to the space $\mathcal{D}_P$ of all absolutely continuous functions u in $L_2(-\infty,\infty)$
with $u' \in L_2(-\infty,\infty)$, then we can show that P is self-adjoint.

First we recall that if $u \in \mathcal{D}_P$, then
$$\lim_{t\to\pm\infty} u(t) = 0, \tag{7.10.5}$$
see Exercise 15, Section 5.22.
Now if $u, v \in \mathcal{D}_P$, then by integrating by parts we get

$$(Pu,v) - (u,Pv) = \int_{-\infty}^{\infty}\big[-iu'(\xi)\overline{v(\xi)} - iu(\xi)\overline{v'(\xi)}\big]\,d\xi$$
$$= \lim_{T\to\infty}\int_{-T}^{T}\big[-iu'(\xi)\overline{v(\xi)} - iu(\xi)\overline{v'(\xi)}\big]\,d\xi$$
$$= \lim_{T\to\infty} -i\big[u(T)\overline{v(T)} - u(-T)\overline{v(-T)}\big] = 0,$$

by (7.10.5). Hence, P is symmetric on this larger domain.
The fact that P is self-adjoint now follows easily from Theorem 5.22.8 and
Example 1. ∎
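The integration-by-parts identity behind the symmetry of P can be checked numerically for rapidly decaying functions. The sketch below assumes the normalization P = -i d/dx (suggested by the integrand above, but not confirmed by this section) and an inner product that is conjugate-linear in its second argument; the test functions, truncation interval, and function names are illustrative choices.

```python
import math

def inner(f, g, T=10.0, n=20000):
    """Trapezoid approximation of (f, g) = integral of f * conj(g) over [-T, T]."""
    h = 2.0 * T / n
    total = 0.0 + 0.0j
    for j in range(n + 1):
        x = -T + j * h
        w = 0.5 if j in (0, n) else 1.0
        total += w * f(x) * complex(g(x)).conjugate()
    return total * h

# Two smooth, rapidly decaying test functions and their derivatives.
u = lambda x: math.exp(-x * x)
du = lambda x: -2.0 * x * math.exp(-x * x)
v = lambda x: (x + 1j) * math.exp(-x * x / 2.0)
dv = lambda x: (1.0 - x * (x + 1j)) * math.exp(-x * x / 2.0)

Pu = lambda x: -1j * du(x)      # assumed normalization P = -i d/dx
Pv = lambda x: -1j * dv(x)

defect = inner(Pu, v) - inner(u, Pv)   # vanishes when the boundary terms vanish
```

The defect is zero up to quadrature error because both functions tend to zero at infinity, which is exactly the role played by (7.10.5) in the argument.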


EXAMPLE 6. The Momentum Operator P, defined in Example 5, Section 4 is 
self-adjoint on the domain Dp, defined in Theorem 5.22.9. This can be seen by an 
obvious adaptation of the argument used in the last example. J 


EXERCISES 


1. Show that the adjoint operator L* is linear. (Does your argument use the fact 
that L is linear?) 


2. Let L: $\mathcal{D}_L \to H$ be a linear operator and let $G_L$ denote the graph of L, that is,
$$G_L = \{\{x,Lx\} \in H \oplus H: x \in \mathcal{D}_L\}.$$

(a) Show that $G_L$ is a linear subspace of $H \oplus H$.
(b) Show that L is a closed operator if and only if $G_L$ is a closed linear subspace
of $H \oplus H$.
(c) Define V: $H \oplus H \to H \oplus H$ by V{x,y} = {y,-x}. Show that (7.10.1) can
be rewritten as
$$(V\{x,Lx\}, \{y,L^*y\}) = 0, \tag{7.10.6}$$

where $x \in \mathcal{D}_L$ and $y \in \mathcal{D}_{L^*}$.
(d) Use (7.10.6) to show that $G_{L^*} = V(G_L)^{\perp}$.
(e) Show that L is closed if and only if $G_L = V(G_{L^*})^{\perp}$.


3. The following argument, which uses Exercise 2, will lead to a proof that $\mathcal{D}_{L^*}$
is dense in H when L is closed (Theorem 7.10.3). Let $z \perp \mathcal{D}_{L^*}$.
(a) Show that $\{0,z\} \perp V(G_{L^*})$ in $H \oplus H$.
(b) Show that $\{0,z\} \in G_L$ when L is closed.
(c) Show that z = 0.


4. Extend Exercise 3 to show that $\mathcal{D}_{L^*}$ is dense in H when L is closable.


5. Let L be a closed, densely defined operator on a Hilbert space H. Use Exercise
2 and Theorem 7.10.3 to show that L = L**.


6. Let $L$ be a closable, densely defined operator on a Hilbert space $H$ with domain
$\mathcal{D}_L$. We shall say that an operator $\bar{L}$ with domain $\mathcal{D}_{\bar{L}}$ is the closure of $L$ if
(a) $\bar{L}$ is a closed operator,
(b) $\bar{L}$ is an extension of $L$, and
(c) every other closed linear operator that is an extension of $L$ is also an
extension of $\bar{L}$.
Show that $L$ has a closure and that the closure is uniquely defined. Show that
$L^{**}$ is the closure of $L$.


7. (a) Show that every symmetric linear operator is closable. 
(b) Show that a linear operator L is symmetric if and only if L* is an extension 
of L**. 


8. Let $L$ be a densely defined operator on a Hilbert space $H$ and let $G_L$ denote the
graph of $L$. Let $\bar{G}_L$ denote the closure of $G_L$ in $H \oplus H$. Assume that $\bar{G}_L$ has
the property that if $(x,y)$ and $(x,y')$ are in $\bar{G}_L$, then $y = y'$. In other words, the
x-coordinate determines the y-coordinate, which we write as $y = \bar{L}x$. Show that
$L$ is closable and that $\bar{L}$ is the closure of $L$.


9. Let $\{e_n : n = 0, 1, \ldots\}$ be an orthonormal basis for an infinite-dimensional
separable Hilbert space $H$ and let $\mathcal{D}$ denote the collection of all vectors $x$ in $H$
for which $\sum_{n=0}^{\infty} n|(x,e_n)|^2 < \infty$. The creation and annihilation operators, $\tilde{A}$ and
$A$, are defined on $\mathcal{D}$ by
$$Ax = \sum_{n=0}^{\infty} \sqrt{n+1}\,(x,e_{n+1})\,e_n$$
and
$$\tilde{A}x = \sum_{n=1}^{\infty} \sqrt{n}\,(x,e_{n-1})\,e_n.$$

(a) Show that $A$ and $\tilde{A}$ are densely defined.
(b) Show that for $x, y \in \mathcal{D}$ one has $(Ax,y) = (x,\tilde{A}y)$. (The operators $A$ and
$\tilde{A}$ are related to $P$ and $Q$ as we shall see in Section 14.)




10. Let $f: R \to R$ be a continuous real-valued function on $R$ and define $f(Q)$ by
$$f(Q): u(x) \to f(x)u(x),$$
where the domain of $f(Q)$ is
$$\{u(x) \in L_2(-\infty,\infty) : f(x)u(x) \in L_2(-\infty,\infty)\}.$$

(a) Show that $f(Q)$ is self-adjoint.
(b) Does $f(Q)$ remain self-adjoint if we drop the assumption that $f(x)$ be
continuous?
11. Consider $L = S_r + S_l + \Phi$ on $l_2(0,\infty)$, where $S_r$ and $S_l$ are the right and left
shift operators, $\Phi$ is given by
$$\Phi(x_1,x_2,\ldots) = (\phi(1)x_1, \phi(2)x_2, \ldots),$$
and $\phi(n)$ is real-valued with $|\phi(n)| \to \infty$ as $n \to \infty$.
(a) Define the domain of $L$ and show that $L$ is self-adjoint.
(b) Show that $L$ has a compact self-adjoint resolvent. (See Exercise 3,
Section 6.9.)
(c) Show that the spectrum of $L$ is only point spectrum.

12. (Continuation of Exercise 11.) Assume that $\phi(n) \to \lambda_0$ as $n \to \infty$.
(a) Show that $L$ does not have a compact self-adjoint resolvent.
(b) Let $\{\lambda_n\}$ be the eigenvalues of $L$. Show that the sequence $\{\lambda_n\}$ has no limit,
finite or infinite.


13. Show that a weighted sum of projections is closable.

14. Extend the results of Example 1, to nonseparable Hilbert spaces. 

15. Let P be a self-adjoint projection on a Hilbert space H. Show that P is 
continuous. 


16. Show that a projection P on a Hilbert space H is orthogonal if and only if it is 
self-adjoint. (Compare with Theorem 5.23.9.) 


11. THE CAYLEY TRANSFORM 


We see, then, that if $L$ is a symmetric operator that is not self-adjoint, then
$\mathcal{D}_L \subset \mathcal{D}_{L^*}$ but $\mathcal{D}_L \neq \mathcal{D}_{L^*}$. This suggests that it may be possible to extend $L$, that is,
enlarge $\mathcal{D}_L$ in such a way that the extension is self-adjoint. It is the purpose of this
section to develop a theory for determining when such a symmetric operator has
a self-adjoint extension. We will show, for example, that every symmetric differential
operator with real coefficients has a self-adjoint extension (Corollary 7.11.7).
The main tool in this section will be the Cayley transform

$$M = (L - iI)(L + iI)^{-1}, \qquad (7.11.1)$$

where $L$ is a symmetric operator. Before proceeding, we note here, for reference,
that the inverse Cayley transform is given by

$$L = i(I + M)(I - M)^{-1}. \qquad (7.11.2)$$




In Exercise 1 we will show that if $L$ is a symmetric operator on a Hilbert space
$H$, then the Cayley transform $M$ is well defined. Indeed, if $\mathcal{D}_M$ is the closure of the
linear subspace
$$\{y = (L + iI)x : x \in \mathcal{D}_L\},$$
then for $y = (L + iI)x \in \mathcal{D}_M$, $My$ is given by
$$My = (L - iI)x. \qquad (7.11.3)$$

Let $\mathcal{R}_M = M(\mathcal{D}_M)$ denote the range of $M$. We will also show that $\|My\| = \|y\|$ for
all $y \in \mathcal{D}_M$. That is, the Cayley transform $M$ is an isometry on $\mathcal{D}_M$ and, therefore,
$\mathcal{R}_M$ is a closed linear subspace of $H$. (This does not mean that $M$ is a unitary
operator since it may happen that $\mathcal{D}_M \neq H$.)

This ends our discussion of the general case where L is symmetric. We now 
ask, what happens to $M$ when $L$ is a self-adjoint operator? The answer is given
in the next theorem. 


7.11.1 THEOREM. Let $L: \mathcal{D}_L \to H$ be a symmetric operator and let $M$ be the
Cayley transform of $L$. Then the following statements are equivalent:

(a) $L$ is self-adjoint.
(b) $M$ is a unitary operator.
(c) $\mathcal{D}_M = H$ and $\mathcal{R}_M = H$.
(d) $\mathcal{D}_M^{\perp} = \{0\} = \mathcal{R}_M^{\perp}$.


The proof of this theorem is presented in Exercise 2 in this section. 

Some further properties of the inverse Cayley transform are discussed in the 
exercises. 
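
As a finite-dimensional illustration of (7.11.1), (7.11.2) and Theorem 7.11.1, the following sketch computes the Cayley transform of a 2 x 2 Hermitian matrix and checks that it is unitary and that the inverse transform recovers the matrix. The particular matrix and all helper names (`matmul`, `add`, `inverse`, and so on) are assumptions chosen for illustration only; in finite dimensions every symmetric operator is already self-adjoint, so the sketch says nothing about deficiency subspaces.

```python
# Finite-dimensional sketch: Cayley transform of a 2x2 Hermitian matrix.
# Matrices are nested lists of complex numbers; all helper names are illustrative.

def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)] for i in range(2)]

def add(a, b, s=1):
    # Returns a + s*b.
    return [[a[i][j] + s * b[i][j] for j in range(2)] for i in range(2)]

def scale(c, a):
    return [[c * a[i][j] for j in range(2)] for i in range(2)]

def inverse(a):
    det = a[0][0] * a[1][1] - a[0][1] * a[1][0]
    return [[a[1][1] / det, -a[0][1] / det], [-a[1][0] / det, a[0][0] / det]]

def adjoint(a):
    return [[a[j][i].conjugate() for j in range(2)] for i in range(2)]

def close(a, b, tol=1e-12):
    return all(abs(a[i][j] - b[i][j]) < tol for i in range(2) for j in range(2))

I2 = [[1 + 0j, 0j], [0j, 1 + 0j]]
L = [[2 + 0j, 1 - 1j], [1 + 1j, -1 + 0j]]   # Hermitian: L equals its adjoint

# (7.11.1): M = (L - iI)(L + iI)^{-1}
M = matmul(add(L, scale(1j, I2), -1), inverse(add(L, scale(1j, I2))))
# (7.11.2): L = i(I + M)(I - M)^{-1}
L_back = scale(1j, matmul(add(I2, M), inverse(add(I2, M, -1))))

is_unitary = close(matmul(adjoint(M), M), I2)   # M*M = I
recovers_L = close(L_back, L)
print(is_unitary, recovers_L)
```

Both flags should come out true here, in agreement with the equivalence of (a) and (b) in Theorem 7.11.1.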

In order to see how the Cayley transform can be used in the study of symmetric
operators, let us assume that we are given an operator $L$ that is symmetric but
not necessarily self-adjoint. This means that

$$L \subset L^{**} \subset L^*, \qquad (7.11.4)$$

where we use the notation $A \subset B$ to depict the fact that $B$ is an extension of $A$.
The proof of (7.11.4) is not difficult. The relationship $L \subset L^*$ follows from the fact
that $L$ is symmetric. The relationships $L \subset L^{**}$ and $L^{**} \subset L^*$ follow from the fact
that $L^{**}$ is the closure of $L$, by Exercises 6-7 of Section 10.

Now if $L$ is not self-adjoint, then its domain $\mathcal{D}_L$ is strictly smaller than $\mathcal{D}_{L^*}$.
This suggests that we may try to extend $L$ in order to "make it" self-adjoint.
Furthermore, if $\tilde{L}$ is some extension of $L$, that is, $L \subset \tilde{L}$, then it follows from the
definition of the adjoint that $\tilde{L}^* \subset L^*$. Thus an extension of $L$ will not only enlarge
$\mathcal{D}_L$, but it will also shrink $\mathcal{D}_{L^*}$. Finally, if $\tilde{L}$ is a symmetric extension of $L$, then we
must have

$$L \subset \tilde{L} \subset \tilde{L}^* \subset L^*. \qquad (7.11.5)$$

If we can choose the extension $\tilde{L}$ in such a way that $\mathcal{D}_{\tilde{L}} = \mathcal{D}_{\tilde{L}^*}$, then we see that
$\tilde{L} = \tilde{L}^*$, that is, $\tilde{L}$ is self-adjoint.

The problem we now wish to study is to determine the conditions under which a symmetric
operator has a self-adjoint extension. We will see shortly that just about all of the
symmetric operators discussed in this chapter have self-adjoint extensions.

The proof of the following lemma is left as an exercise. 


7.11.2 LEMMA. Let $L$ be a symmetric linear operator on a Hilbert space $H$
and define the Cayley transform $M$ by (7.11.1). Then the following statements are
valid.

(a) If $\tilde{L}$ is a symmetric extension of $L$ and $\tilde{M}$ is the Cayley transform of $\tilde{L}$, then
$\tilde{M}$ is an extension of $M$.

(b) If $\tilde{M}$ is an isometric extension of $M$ and if $\tilde{L}$ is the inverse Cayley transform
of $\tilde{M}$, then $\tilde{L}$ is a symmetric extension of $L$.


Let $L$ be a symmetric operator on a Hilbert space $H$ and let $M$ be the Cayley
transform of $L$. The subspaces $\mathcal{D}_M^{\perp}$ and $\mathcal{R}_M^{\perp}$ are called the deficiency subspaces of
$L$. Let $m = \dim \mathcal{D}_M^{\perp}$ and $n = \dim \mathcal{R}_M^{\perp}$. Then $(m,n)$ are called the deficiency
indices of $L$. It follows from Theorem 7.11.1 (d) that $L$ is self-adjoint if and only if
the deficiency indices are $(0,0)$.

We can now determine which symmetric operators have self-adjoint extensions. 


7.11.3 THEOREM. Let $(m,n)$ denote the deficiency indices of a symmetric
operator $L$. Then $L$ has a self-adjoint extension if and only if $m = n$.


Proof: First assume that $L$ has a self-adjoint extension and let $\tilde{L}$ denote this
extension. Let $M$ and $\tilde{M}$ be the Cayley transforms of $L$ and $\tilde{L}$, respectively. Since
$\tilde{M}$ is an isometric extension of $M$ it is evident that the deficiency indices for $\tilde{L}$ are
of the form $(m - p, n - p)$. However, $\tilde{L}$ is self-adjoint and therefore
$$m - p = n - p = 0,$$
or $m = n = p$.
Now assume that $m = n$, and let $M: \mathcal{D}_M \to \mathcal{R}_M$ be the Cayley transform of $L$.
Since $\dim \mathcal{D}_M^{\perp} = \dim \mathcal{R}_M^{\perp}$ there is an isometric mapping $N$ of $\mathcal{D}_M^{\perp}$ onto $\mathcal{R}_M^{\perp}$
by Exercise 12, Section 5.19. Since
$$H = \mathcal{D}_M + \mathcal{D}_M^{\perp} = \mathcal{R}_M + \mathcal{R}_M^{\perp},$$
we define $\tilde{M}: H \to H$ by
$$\tilde{M}(x + y) = Mx + Ny,$$
where $x \in \mathcal{D}_M$ and $y \in \mathcal{D}_M^{\perp}$. It is easy to see that $\tilde{M}$ is an isometric extension of $M$
and that $\tilde{M}$ is a unitary mapping. It follows from Theorem 7.11.1 and Lemma
7.11.2 that the inverse Cayley transform $\tilde{L}$ is a self-adjoint extension of $L$. ∎




We shall give here one criterion for a symmetric operator $L$ to have equal
deficiency indices. Other criteria are presented in the exercises below.


7.11.4 DEFINITION. A mapping $J: H \to H$ is said to be a conjugation if
$J^2 = I$ and $(Jx,Jy) = (y,x)$ for all $x$ and $y$ in $H$.


For example, if $H$ is a complex $L_2$-space and $Jf = \bar{f}$, where the bar denotes
complex conjugation, then $J$ is a conjugation in the sense of Definition 7.11.4.
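
To make Definition 7.11.4 concrete, here is a small sketch on the finite-dimensional space $C^3$ with inner product $(x,y) = \sum_k x_k \bar{y}_k$, where $J$ is componentwise complex conjugation. The test vectors, the scalar `alpha`, and the function names are assumptions made for illustration; the last check anticipates the conjugate-linearity stated in Lemma 7.11.5.

```python
# Sketch on C^3 with (x, y) = sum of x_k * conj(y_k); J is componentwise conjugation.

def inner(x, y):
    return sum(a * b.conjugate() for a, b in zip(x, y))

def J(x):
    return [a.conjugate() for a in x]

x = [1 + 2j, -3j, 0.5 + 0j]
y = [2 - 1j, 1 + 1j, 4j]
alpha = 2 - 3j

# J^2 = I
involution = J(J(x)) == x
# (Jx, Jy) = (y, x)
swaps_inner = abs(inner(J(x), J(y)) - inner(y, x)) < 1e-12
# J(alpha x) = conj(alpha) Jx
lhs = J([alpha * a for a in x])
rhs = [alpha.conjugate() * a for a in J(x)]
conj_linear = all(abs(u - v) < 1e-12 for u, v in zip(lhs, rhs))

print(involution, swaps_inner, conj_linear)
```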


The following lemma is an easy exercise for the reader. 


7.11.5 LEMMA. Let $J$ be a conjugation on a Hilbert space $H$. Then $J$ is a one-to-one
mapping of $H$ onto itself with $J^{-1} = J$. Furthermore one has

$$J(x + y) = Jx + Jy, \qquad x, y \in H,$$
$$J(\alpha x) = \bar{\alpha}Jx, \qquad \alpha \in C,\ x \in H.$$


7.11.6 THEOREM. Let $L$ be a symmetric operator on a Hilbert space $H$ and
assume that there is a conjugation $J$ on $H$ with the property that $JLx = LJx$ for all
$x \in \mathcal{D}_L$. Then $L$ has a self-adjoint extension.


Proof: First we note that $JL = LJ$ on $\mathcal{D}_L$ implies that $J$ is a one-to-one
mapping of $\mathcal{D}_L$ onto itself. Furthermore since
$$J(L + iI)x = (L - iI)Jx$$
for all $x \in \mathcal{D}_L$, we see that $J$ is a one-to-one mapping of
$$\mathcal{R}_+ = \{y = (L + iI)x : x \in \mathcal{D}_L\} \quad \text{onto} \quad \mathcal{R}_- = \{y = (L - iI)x : x \in \mathcal{D}_L\}.$$

Hence $J$ is a one-to-one mapping of $\mathcal{D}_M = \bar{\mathcal{R}}_+$ onto $\mathcal{R}_M = \bar{\mathcal{R}}_-$, where $M$ denotes
the Cayley transform of $L$.
Now if $z \in \mathcal{D}_M^{\perp}$, then for all $x \in \mathcal{D}_L$ one has $((L + iI)x, z) = 0$. Furthermore,
$$(Jz, (L - iI)x) = (Jz, J^2(L - iI)x) = (J(L - iI)x, z) = ((L + iI)Jx, z) = 0.$$
Hence $J$ is a one-to-one mapping of $\mathcal{D}_M^{\perp}$ onto $\mathcal{R}_M^{\perp}$. Finally, this implies that
$\dim \mathcal{D}_M^{\perp} = \dim \mathcal{R}_M^{\perp}$, so $L$ has equal deficiency indices and therefore has a
self-adjoint extension. ∎


The following corollary is now immediate.

7.11.7 COROLLARY. Let $L$ be a differential or integral operator on the complex
space $L_2(\Omega)$. Assume that $L$ has real coefficients. If $L$ is symmetric, then $L$ has a
self-adjoint extension.


Proof: We simply note that $LJu = JLu$ for all $u \in \mathcal{D}_L$, where $Ju = \bar{u}$. ∎




EXERCISES 


1. Let $L: \mathcal{D}_L \to H$ be a symmetric operator and let $M = (L - iI)(L + iI)^{-1}$ be the
Cayley transform of $L$.
(a) Show that $\|(L + iI)x\| = \|(L - iI)x\| \ge \|x\|$ for all $x \in \mathcal{D}_L$.
(b) Let $\mathcal{D}_M$ denote the closure of $\{y = (L + iI)x : x \in \mathcal{D}_L\}$. Show that for
$y = (L + iI)x \in \mathcal{D}_M$ one has $My = (L - iI)x$.
(c) Show that $\|My\| = \|y\|$ for all $y \in \mathcal{D}_M$.


2. This will lead to a proof of Theorem 7.11.1.
(a) Use the results of Section 5.15 to show that 7.11.1 (c) and 7.11.1 (d) are
equivalent.
(b) Use Theorem 5.19.2 to show that 7.11.1 (b) and 7.11.1 (c) are equivalent.
(c) Define $\mathcal{R}_+$ and $\mathcal{R}_-$ by
$$\mathcal{R}_+ = \{y = (L + iI)x : x \in \mathcal{D}_L\}, \qquad \mathcal{R}_- = \{y = (L - iI)x : x \in \mathcal{D}_L\}.$$
Assume that 7.11.1 (a) holds. Let $z \in \mathcal{R}_+^{\perp}$ and show that $z$ is in the domain
of $(L - iI)$ and that $(L - iI)z \perp \mathcal{D}_L$. Hence $\mathcal{R}_+^{\perp} = \{0\}$ and, similarly,
$\mathcal{R}_-^{\perp} = \{0\}$. Then show that 7.11.1 (c) holds.
(d) Assume that both 7.11.1 (b) and 7.11.1 (c) hold and let $x \in \mathcal{D}_{L^*}$ and $x^* = L^*x$.
Show that
$$x = (I - M)\left(\frac{x - ix^*}{2}\right), \qquad x^* = i(I + M)\left(\frac{x - ix^*}{2}\right).$$
Next show that $L = i(I + M)(I - M)^{-1}$, that is, if $z = (I - M)w$ for some
$w \in H$, then $z \in \mathcal{D}_L$ and $Lz = i(I + M)w$. Finally show that $x \in \mathcal{D}_L$ and
$x^* = Lx$.


3. (a) Prove Lemma 7.11.2.
(b) Prove Lemma 7.11.5.


4. (a) Let $M$ be the Cayley transform of a symmetric operator $L$ on a Hilbert
space $H$. Show that the range of $M - I$ is dense in $H$.
(b) Let $M: \mathcal{D}_M \to \mathcal{R}_M$ be an isometric operator, where $\mathcal{D}_M$ and $\mathcal{R}_M$ are closed
linear subspaces of a Hilbert space $H$. Assume that the range of $M - I$ is
dense in $H$, and define $L$ by Equation (7.11.2). Show that $L$ is a closed
densely defined symmetric operator.


5. (Semibounded Symmetric Transformations.) A symmetric operator $S$ on a
Hilbert space $H$ is said to be semibounded if there is a real number $\alpha$ such that

$$(Sx,x) \ge \alpha(x,x) \quad \text{for all } x \text{ in } \mathcal{D}_S. \qquad (7.11.6)$$


(a) Let $S$ satisfy (7.11.6) and let $T = S - (\alpha - 1)I$. Show that $T$ is a symmetric
operator with $(Tx,x) \ge (x,x)$ for all $x$ in $\mathcal{D}_T$. Show that $S$ has a
self-adjoint extension if and only if $T$ has a self-adjoint extension.

(b) Let $\langle x,y \rangle = (Tx,y)$. Show that $\langle x,y \rangle$ defines a new inner product on
$\mathcal{D}_T$. Let $|||x||| = (\langle x,x \rangle)^{1/2}$ be the norm and show that $|||x||| \ge \|x\|$.


(c) Let $H_0$ denote the completion of the inner product space $(\mathcal{D}_T, \langle \cdot,\cdot \rangle)$.
Show that $H_0$ can be identified with a linear subspace of $H$ with the
property that $\mathcal{D}_T \subset H_0 \subset H$. Show that for all $x$ in $H_0$ one has $|||x||| \ge \|x\|$.

(d) Let $y \in H$ be fixed. Show that the linear functional $l_y(x) = (x,y)$ satisfies
$$|l_y(x)| = |(x,y)| \le \|x\|\,\|y\| \le |||x|||\,\|y\|$$
for all $x \in H_0$. Hence, there is a $z \in H_0$ such that
$$l_y(x) = \langle x,z \rangle.$$
Let $B: H \to H_0$ be the mapping $z = By$. Show that $B$ is a bounded linear
operator on $H$. Show that $B$ is self-adjoint and that $A = B^{-1}$ exists.
(e) Show that $A$ is self-adjoint and that $A$ is an extension of $T$.


6. Show that a symmetric operator may have several self-adjoint extensions. For
example, consider
$$Lu = u'',$$
where $\mathcal{D}_L = \{u \in L_2(0,1) : u(0) = u'(0) = u(1) = u'(1) = 0\}$.


7. A symmetric operator $L$, with the property that $L^{**}$ is self-adjoint, is said to
be essentially self-adjoint. (Recall that $L^{**}$ is the closure of $L$, see Exercise
6, Section 10.)
(a) Show that an essentially self-adjoint operator has a unique self-adjoint
extension.
(b) Show that a symmetric operator on a Hilbert space $H$ is essentially
self-adjoint if and only if $\mathcal{R}_+$ and $\mathcal{R}_-$ are dense in $H$.


8. Let $L$ be a densely defined bounded symmetric operator on a Hilbert space $H$.
Show that $L$ is essentially self-adjoint.


9. Let $M$ be the Cayley transform of a self-adjoint operator $L$, and assume that
$L^{-1}$ exists and is densely defined.
(a) Show that $L^{-1}$ is symmetric.
(b) Show that the Cayley transform of $L^{-1}$ is $-M^{-1}$.
(c) Show that $L^{-1}$ is self-adjoint.


10. Let $L$ be essentially self-adjoint on a Hilbert space $H$ and $\mathcal{D}_L$ the domain of $L$.
Show that for all complex numbers $\lambda$ with $\operatorname{Im} \lambda \neq 0$, the space $(L - \lambda I)(\mathcal{D}_L)$
is dense in $H$.


11. (Converse of Exercise 10.) Let $L$ be a symmetric operator on a Hilbert space
$H$ with domain $\mathcal{D}_L$. Assume that for some complex number $\lambda$ with $\operatorname{Im} \lambda \neq 0$
the spaces $(L - \lambda I)(\mathcal{D}_L)$ and $(L - \bar{\lambda}I)(\mathcal{D}_L)$ are dense in $H$. Show that $L$ is
essentially self-adjoint. [Hint: First note that if $\mathcal{D}$ is dense in $H$ and if $T$ is a
bounded linear operator with $\|T\| < 1$, then $(I + T)(\mathcal{D})$ is dense in $H$. Next
show that
$$(L + iI) = \frac{1}{c+1}\,[cT + I](L - \lambda I)$$
for an appropriate choice of $c$, $0 < c < 1$, where
$$T = (L - \bar{\lambda}I)(L - \lambda I)^{-1}.]$$


12. Let $S$ be a symmetric operator and let $T$ be an essentially self-adjoint operator
on a common domain $\mathcal{D}$. Assume that there is an $\epsilon$, $0 < \epsilon < 1$, and a constant
$K$ such that
$$\|Su\| \le \epsilon\|Tu\| + K\|u\|$$
for all $u \in \mathcal{D}$. Show that $S + T$ is essentially self-adjoint. [Hint: Choose $r$ so
that $\epsilon + K|r|^{-1} < 1$ and show that
$$\|S(T + irI)^{-1}\| < 1.$$
Then apply Exercise 10 while noting that
$(S + T + irI)\mathcal{D} = \{S(T + irI)^{-1} + I\}\{T + irI\}\mathcal{D}$ is dense in $H$.]


13. Extend Exercise 12 to show that if $T$ is self-adjoint on $\mathcal{D}$, then $S + T$ is
self-adjoint on $\mathcal{D}$.


14. Let $S$ be a symmetric operator and let $T$ be an essentially self-adjoint operator
on a common domain $\mathcal{D}$. Assume that there are nonnegative constants $\alpha$ and
$\beta$ such that
$$\|Su\|^2 \le \alpha(u,Tu) + \beta\|u\|^2.$$
Show that $S + T$ is essentially self-adjoint.


15. Let $L$ be a symmetric operator on a Hilbert space $H$ and assume that there
exists an orthonormal basis consisting entirely of eigenvectors of $L$. Show that
$L$ is essentially self-adjoint. (Use this fact to compare Theorem 7.3.3 with the
Spectral Theorem.)


16. Show that the linear operator
$$Lu = i\,\frac{du}{dt}$$
with domain
$$\mathcal{D}_L = \{u \in L_2(0,\infty) : u(0) = 0 \text{ and } u' \in L_2(0,\infty)\}$$
is symmetric on $L_2(0,\infty)$ but that it has no self-adjoint extensions.


17. Let $L = \sum_n \lambda_n P_n$ be a weighted sum of projections, where the $\lambda_n$'s are real.
Show that the Cayley transform is given by
$$M = \sum_n \left(\frac{\lambda_n - i}{\lambda_n + i}\right) P_n.$$


12. QUANTUM MECHANICS, REVISITED


Let us now illustrate how the concepts of self-adjoint operators and
self-adjoint extensions of symmetric operators are used in the study of quantum
mechanics. In this section we will show how the energy function in a
classical-mechanical system becomes a self-adjoint operator in the corresponding
quantum-mechanical system. In the next section we shall look at the Heisenberg Uncertainty
Relation and, finally, in Section 14 we shall analyze the quantum-mechanical
harmonic oscillator.

Recall that in Section 5.25 we gave a brief introduction into the foundations of 
quantum mechanics. We noted that the observables of a quantum-mechanical 
system can be identified with the self-adjoint operators on an infinite-dimensional 
separable Hilbert space H. Of course, all such Hilbert spaces are unitarily equivalent. 
Therefore, when one is trying to study a particular quantum-mechanical system, 
one has a choice for the mathematical model. 

It is customary to make a "standard" choice for certain quantum-mechanical
systems, in particular for a system consisting of $l$ particles interacting with one
another. This type of system arises in atomic physics.

In the classical-mechanical system the differential equations of motion can be
represented in the Hamiltonian form

$$\frac{dq_k}{dt} = \frac{\partial H}{\partial p_k}, \qquad \frac{dp_k}{dt} = -\frac{\partial H}{\partial q_k}, \qquad 1 \le k \le n,$$

where $q = (q_1,\ldots,q_n)$ represent the position coordinates, $p = (p_1,\ldots,p_n)$ represent
the "generalized" momenta coordinates, and $H = H(q,p)$ is the Hamiltonian
function. For the quantum-mechanical model one then makes the identification
$$(x_1,\ldots,x_n) \leftrightarrow (q_1,\ldots,q_n),$$
$$q_k \to Q_k,$$
$$p_k \to hP_k,$$

for $1 \le k \le n$, where $Q_k$ and $P_k$ are the operators on $L_2(R^n)$ given in Examples 2
and 5 of Section 4. As shown in Section 10, the operators $P_k$ and $Q_k$ are
self-adjoint. The Schrödinger operator for this quantum-mechanical system then
becomes

$$S = H(Q,hP) = H(Q_1,\ldots,Q_n, hP_1,\ldots,hP_n), \qquad (7.12.1)$$


where $h$ is a universal constant. For example, in many cases the Hamiltonian is of
the form

$$H(q,p) = (p_1^2 + \cdots + p_n^2) + V(q_1,\ldots,q_n),$$
where $V$ is a real-valued potential function. Then $S$ is uniquely determined by
$$S = -h^2\Delta_n + V(Q),$$

where $\Delta_n$ is the n-dimensional Laplacian operator given in Example 6, Section 4
and $V(Q)$ is the potential operator described in Example 3, Section 4. Since $S$ is a
symmetric differential operator on $C_0^{\infty}(R^n)$ with real coefficients we see that $S$
has a self-adjoint extension.
The dynamical equations of motion [Equation (5.25.5)] for this system then
become

$$i\,\frac{\partial \psi}{\partial t} = S\psi,$$

where $\psi$ is a function of $(x_1,\ldots,x_n)$ and $t$.




EXAMPLE 1. (HYDROGEN ATOM.) For the hydrogen atom one has a single
electron interacting with a nucleus. Thus $n = 3$, $\Delta_n = \Delta_3$, and the potential $V$
becomes the Coulomb potential

$$V(q_1,q_2,q_3) = -k(q_1^2 + q_2^2 + q_3^2)^{-1/2},$$
where $k$ is a positive constant. ∎


EXAMPLE 2. (HELIUM ATOM.) For the helium atom one has two electrons
interacting with a nucleus. Thus $n = 6$, $\Delta_n = \Delta_6$, and the potential $V$ becomes
$$V(q_1,q_2,q_3,q_4,q_5,q_6) = k\left[|u - v|^{-1} - 2|u|^{-1} - 2|v|^{-1}\right],$$

where $k$ is a constant, $u = (q_1,q_2,q_3)$, $v = (q_4,q_5,q_6)$, and $|\cdot|$ denotes the
Euclidean norm, that is, $|u| = [q_1^2 + q_2^2 + q_3^2]^{1/2}$.

For a detailed study of hydrogen and helium operators we refer the reader to
Hellwig [1] and Kato [1]. ∎
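
The two potentials in Examples 1 and 2 can be evaluated directly. The sketch below is only an illustration of the formulas; the function names, the sample points, and the default value `k=1.0` are assumptions and carry no physical units.

```python
import math

def norm(u):
    # Euclidean norm |u|.
    return math.sqrt(sum(c * c for c in u))

def hydrogen_potential(q, k=1.0):
    # V(q1,q2,q3) = -k (q1^2 + q2^2 + q3^2)^(-1/2); k = 1.0 is an assumed value.
    return -k / norm(q)

def helium_potential(q, k=1.0):
    # V = k [ |u - v|^-1 - 2|u|^-1 - 2|v|^-1 ], u = (q1,q2,q3), v = (q4,q5,q6).
    u, v = q[:3], q[3:]
    diff = [a - b for a, b in zip(u, v)]
    return k * (1.0 / norm(diff) - 2.0 / norm(u) - 2.0 / norm(v))

print(hydrogen_potential((3.0, 0.0, 4.0)))
print(helium_potential((1.0, 0.0, 0.0, -1.0, 0.0, 0.0)))
```

Both potentials are singular where a denominator vanishes (an electron at the nucleus, or the two electrons at the same point), which is one reason the associated operators need the careful treatment given in the references cited above.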


We conclude our story by considering two applications in quantum mechanics. 
The first is a discussion of the Heisenberg Uncertainty Theorem and the second is 
a discussion of the quantum-mechanical harmonic oscillator. 


13. HEISENBERG UNCERTAINTY THEOREM 


Let $L$ and $M$ be two observables (that is, self-adjoint operators) for a
quantum-mechanical system and let $N$ be defined by
$$iN = LM - ML.$$
Assume that $L$ and $M$ are so defined that the commutator $N$ given above is densely
defined. [Note: $N$ is symmetric.]
Consider now a pure state that is represented (in the sense of Theorem 5.25.1)
by a unit vector $e$ in $H$. This means that the expected value of $L$ and $M$ in state $e$
is given$^{16}$ by
$$\alpha = E(L) = (Le,e),$$
$$\beta = E(M) = (Me,e).$$
The deviation of the observable $L$ in state $e$ is given by
$$(\Delta\alpha)^2 = E((L - \alpha I)^2) = ((L - \alpha I)^2e,e) = \|(L - \alpha I)e\|^2,$$
since $L$ is self-adjoint. Similarly, one has

$$(\Delta\beta)^2 = \|(M - \beta I)e\|^2.$$


7.13.1 THEOREM. In the above notation one has

$$(\Delta\alpha)(\Delta\beta) \ge \tfrac{1}{2}|E(N)| = \tfrac{1}{2}|(Ne,e)|. \qquad (7.13.1)$$


$^{16}$ We assume that the vector $e$ lies in the common domain of $L$, $M$, and $N$.


Proof: Let $T = L - \alpha I + i\rho(M - \beta I)$, where $\rho$ is real. Then $T$ is densely
defined and
$$T^* = (L - \alpha I) - i\rho(M - \beta I).$$
Furthermore,
$$TT^* = (L - \alpha I)^2 + \rho N + \rho^2(M - \beta I)^2.$$
Hence,
$$0 \le (T^*e,T^*e) = (TT^*e,e) = ((L - \alpha I)^2e,e) + \rho(Ne,e) + \rho^2((M - \beta I)^2e,e).$$
That is,
$$0 \le (\Delta\alpha)^2 + \rho E(N) + \rho^2(\Delta\beta)^2. \qquad (7.13.2)$$
Since (7.13.2) is valid for all real $\rho$, the discriminant$^{17}$
$$[E(N)]^2 - 4(\Delta\alpha)^2(\Delta\beta)^2$$
is nonpositive, that is,
$$|E(N)| \le 2(\Delta\alpha)(\Delta\beta). \qquad ∎$$
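
Inequality (7.13.1) can be checked numerically in a finite-dimensional setting. The sketch below takes $L$ and $M$ to be the 2 x 2 Pauli matrices $\sigma_x$ and $\sigma_y$, which is an assumption made purely for illustration (position and momentum themselves cannot be realized by finite matrices); the unit vectors and function names are likewise hypothetical.

```python
import math

def apply(a, x):
    return [sum(a[i][k] * x[k] for k in range(2)) for i in range(2)]

def inner(x, y):
    return sum(u * v.conjugate() for u, v in zip(x, y))

def norm(x):
    return math.sqrt(inner(x, x).real)

def deviation(a, e):
    ae = apply(a, e)
    mean = inner(ae, e).real                      # alpha = (Le, e), real for Hermitian L
    return norm([u - mean * v for u, v in zip(ae, e)])   # ||(L - alpha I)e||

L = [[0, 1], [1, 0]]          # sigma_x (assumed example observable)
M = [[0, -1j], [1j, 0]]       # sigma_y (assumed example observable)

def commutator_expectation(e):
    # ((LM - ML)e, e) = i (Ne, e), so its modulus equals |(Ne, e)|.
    lm = apply(L, apply(M, e))
    ml = apply(M, apply(L, e))
    return inner([u - v for u, v in zip(lm, ml)], e)

def check(e):
    lhs = deviation(L, e) * deviation(M, e)       # (Delta alpha)(Delta beta)
    rhs = 0.5 * abs(commutator_expectation(e))    # (1/2)|(Ne, e)|
    return lhs, rhs

s = 1 / math.sqrt(2)
e1 = [1 + 0j, 0j]
e2 = [s + 0j, s * 1j]         # eigenvector of sigma_y
e3 = [0.6 + 0j, 0.8 + 0j]

for e in (e1, e2, e3):
    print(check(e))
```

For `e1` the two sides coincide, so the bound in Theorem 7.13.1 cannot be improved in general; for `e2` both sides vanish because `e2` is an eigenvector of $M$.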
EXAMPLE 1. (HEISENBERG UNCERTAINTY PRINCIPLE.) Equation (7.13.1)
reduces to the Heisenberg Uncertainty Principle when $L$ is the position operator $Q_k$
and $M$ is the momentum operator $P_k$ on $L_2(R^n)$. In this case, it is easy to see that

$$(Q_kP_k - P_kQ_k)u = iu$$

for all functions $u$ in $C_0^{\infty}(R^n)$. In other words, the operator $N$ becomes the identity
and (7.13.1) becomes

$$(\Delta\alpha)(\Delta\beta) \ge \tfrac{1}{2}.$$
If instead we replace $M$ with $hP_k$, where $h$ is a constant, then (7.13.1) becomes

$$(\Delta\alpha)(\Delta\beta) \ge \frac{h}{2},$$

which is the original result of Heisenberg.


EXERCISES 


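1. Theorem 7.13.1 can be extended to quantum-mechanical states given by a
density operator $W$. In this case, one has $\alpha = E(L) = \operatorname{tr} WL$, $\beta = E(M) = \operatorname{tr} WM$,
$(\Delta\alpha)^2 = \operatorname{tr}[W(L - \alpha I)^2]$, and $(\Delta\beta)^2 = \operatorname{tr}[W(M - \beta I)^2]$. Also Equation (7.13.1)
becomes

$$(\Delta\alpha)(\Delta\beta) \ge \tfrac{1}{2}|\operatorname{tr} WN|. \qquad (7.13.3)$$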


17 Compare this argument with the proof of the Schwarz Inequality. 




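We now outline the proof:
(a) Show that
$$\operatorname{tr} WTT^* = \operatorname{tr} T^*WT \ge 0.$$
(b) Show that
$$\operatorname{tr} WTT^* = \operatorname{tr} W(L - \alpha I)^2 + \rho \operatorname{tr} WN + \rho^2 \operatorname{tr} W(M - \beta I)^2.$$
(c) Prove (7.13.3).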


2. The relationship
$$LMu - MLu = iu$$
is called the Heisenberg commutation property. We saw that $Q_k$ and $P_k$ satisfy
this property. (It can be shown that if $L$ and $M$ satisfy this property, then they are
unbounded, see von Neumann [1].)


14. THE HARMONIC OSCILLATOR 


The Hamiltonian differential equation for a harmonic oscillator in classical
mechanics is

$$\frac{dx}{dt} = \frac{\partial H}{\partial y}, \qquad \frac{dy}{dt} = -\frac{\partial H}{\partial x},$$
where $H = y^2 + \omega^2x^2$, $x$ is position, $y$ is momentum, and $\omega$ is a physical constant.
In the quantum-mechanical system we replace $x$ by the position operator $Q$ and
$y$ by the momentum operator $P$; compare with Section 12. The Hamiltonian
function $H$ then becomes the Schrödinger operator

$$S = P^2 + \omega^2Q^2,$$
or
$$Su = -\frac{d^2u}{dx^2} + \omega^2x^2u$$

for $-\infty < x < \infty$. The operator $S$ is, then, a singular Sturm-Liouville operator,
since the interval is now infinite. We consider $S$ in the Hilbert space $L_2(-\infty,\infty)$.

Let us first show that $S$ is symmetric on $C_0^2(-\infty,\infty)$, where $C_0^2$ denotes the
$C^2$ functions with compact support. If $u$, $v$ are in $C_0^2$, then by integrating by parts
we get

$$\begin{aligned}
(Su,v) - (u,Sv) &= \int_{-\infty}^{\infty} \left\{[-u''(x) + \omega^2x^2u(x)]\bar{v}(x) - u(x)[-\bar{v}''(x) + \omega^2x^2\bar{v}(x)]\right\} dx \\
&= \int_{-\infty}^{\infty} \left[-u''(x)\bar{v}(x) + u(x)\bar{v}''(x)\right] dx \\
&= \int_{-\infty}^{\infty} \left(u'(x)\bar{v}'(x) - u'(x)\bar{v}'(x)\right) dx = 0.
\end{aligned}$$

Hence, $S$ is symmetric on $C_0^2(-\infty,\infty)$. Furthermore, since $S$ is real it has a
self-adjoint extension, which we shall denote by $\tilde{S}$. Its domain, which includes $C_0^2$, we
shall denote by $\mathcal{D}_{\tilde{S}}$.

Our objective now is to construct an orthonormal basis of eigenfunctions for
$\tilde{S}$. We shall see that this basis has the form

$$\phi_n(x) = c_n H_n(\sqrt{\omega}\,x) \exp\left(-\frac{\omega x^2}{2}\right), \qquad n = 0, 1, \ldots,$$

where $H_n$ is the Hermite polynomial of degree $n$ and $c_n$ is a normalization factor.
Thus we seek solutions of
$$Su = \lambda u, \qquad (7.14.1)$$

where $u \neq 0$ and $u \in L_2(-\infty,\infty)$. If we make a change of variable, replacing
$\sqrt{\omega}\,x$ by $x$ and $\lambda$ by $\omega\lambda$, Equation (7.14.1) becomes


$$-\frac{d^2u}{dx^2} + x^2u = \lambda u. \qquad (7.14.2)$$


Let us now make another change of variable with $u = v \exp(-x^2/2)$. Since
$u'' = [v'' - 2xv' + (x^2 - 1)v] \exp(-x^2/2)$, we see that (7.14.2) becomes

$$v'' - 2xv' + (\lambda - 1)v = 0. \qquad (7.14.3)$$
Let $H_n(x)$ denote the function
$$H_n(x) = (-1)^n \exp(x^2) D^n \exp(-x^2), \qquad (7.14.4)$$

where $D^nv = d^nv/dx^n$. It is easy to verify that $H_n(x)$ is a polynomial of degree $n$.
We will now show [see Equation (7.14.8) below] that $H_n(x)$ is a solution of (7.14.3)
with $\lambda = 2n + 1$.


7.14.1 THEOREM. The Hermite polynomials satisfy:

$$H_n'(x) = 2xH_n(x) - H_{n+1}(x); \qquad (7.14.5)$$
$$H_{n+1}(x) - 2xH_n(x) + 2nH_{n-1}(x) = 0; \qquad (7.14.6)$$
$$H_n'(x) = 2nH_{n-1}(x); \qquad (7.14.7)$$
$$H_n''(x) - 2xH_n'(x) + 2nH_n(x) = 0; \qquad (7.14.8)$$
$$\int_{-\infty}^{\infty} H_m(x)H_n(x) \exp(-x^2)\,dx = \delta_{mn}\,2^n n! \sqrt{\pi} \qquad (7.14.9)$$

for $n = 1, 2, \ldots$.


Proof: If we differentiate (7.14.4), we get (7.14.5). Now let $a(x) = \exp(-x^2)$.
Then by repeated differentiation we get
$$D^{n+1}a(x) + 2xD^na(x) + 2nD^{n-1}a(x) = 0, \qquad (7.14.10)$$

$n = 1, 2, \ldots$. If we now multiply (7.14.10) through by $(-1)^{n+1} \exp(x^2)$, we get
(7.14.6). By combining (7.14.5) and (7.14.6) we get (7.14.7). If we now differentiate
(7.14.5) and replace $H_{n+1}'(x)$ by using (7.14.7), we get (7.14.8).

In order to prove (7.14.9) we first note that for any polynomial $P(x)$ one has
$$P(x) \exp(-x^2) \to 0 \quad \text{as } x \to \pm\infty.$$

Now let $m \le n$. Then, by repeated integration by parts we get

$$\int_{-\infty}^{\infty} H_m(x)H_n(x) \exp(-x^2)\,dx = \int_{-\infty}^{\infty} H_m(x)(-1)^n D^n \exp(-x^2)\,dx = \int_{-\infty}^{\infty} \exp(-x^2) D^nH_m(x)\,dx.$$

Thus, if $m < n$, then $D^nH_m(x) = 0$, and the integral above vanishes. If $m = n$, the
last integral becomes

$$n!\,d_n \int_{-\infty}^{\infty} \exp(-x^2)\,dx = n!\,d_n \sqrt{\pi},$$

where $d_n$ is the coefficient of $x^n$ in $H_n(x)$. In the exercises the reader is asked to show
that $d_n = 2^n$. ∎
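
The identities of Theorem 7.14.1 lend themselves to an exact check with integer arithmetic. The sketch below builds the polynomials from the recurrence (7.14.6), starting from $H_0 = 1$ and $H_1 = 2x$, and then tests (7.14.7), (7.14.8), and the leading coefficient $2^n$. The coefficient-list representation and all function names are assumptions made for this illustration.

```python
# Polynomials are coefficient lists [c0, c1, ...] with integer entries.

def poly_add(p, q):
    n = max(len(p), len(q))
    return [(p[i] if i < len(p) else 0) + (q[i] if i < len(q) else 0) for i in range(n)]

def poly_scale(c, p):
    return [c * a for a in p]

def times_x(p):
    return [0] + list(p)

def derivative(p):
    return [k * p[k] for k in range(1, len(p))] or [0]

def trim(p):
    p = list(p)
    while len(p) > 1 and p[-1] == 0:
        p.pop()
    return p

def hermite(n_max):
    # H_0 = 1, H_1 = 2x, and (7.14.6): H_{n+1} = 2x H_n - 2n H_{n-1}.
    h = [[1], [0, 2]]
    for n in range(1, n_max):
        h.append(trim(poly_add(poly_scale(2, times_x(h[n])), poly_scale(-2 * n, h[n - 1]))))
    return h[: n_max + 1]

H = hermite(6)

def satisfies_derivative_rule(n):
    # (7.14.7): H_n' = 2n H_{n-1}, for n >= 1.
    return derivative(H[n]) == poly_scale(2 * n, H[n - 1])

def satisfies_ode(n):
    # Residual of (7.14.8): H_n'' - 2x H_n' + 2n H_n.
    d1 = derivative(H[n])
    d2 = derivative(d1)
    residual = poly_add(poly_add(d2, poly_scale(-2, times_x(d1))), poly_scale(2 * n, H[n]))
    return all(c == 0 for c in residual)

print(H[4])
print(all(satisfies_ode(n) for n in range(7)))
```

The list printed for `H[4]` corresponds to $H_4(x) = 16x^4 - 48x^2 + 12$.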


If we now retrace our steps and use (7.14.9) we see that the Hermite functions

$$\phi_n(x) = (\sqrt{\pi}\,2^n n!)^{-1/2} H_n(x) \exp\left(-\frac{x^2}{2}\right)$$

form an orthonormal collection of eigenfunctions for the operator $[-d^2u/dx^2 + x^2u]$
given in Equation (7.14.2). By (7.14.3) and (7.14.8) the corresponding eigenvalues are $\lambda_n = 2n + 1$. Also,
$\{\omega^{1/4}\phi_n(\sqrt{\omega}\,x)\}$ forms an orthonormal collection of eigenfunctions for the operator $S$
and the corresponding eigenvalues are $\lambda_n = (2n + 1)\omega$.

It remains to show that the collection $\{\phi_n(x)\}$ forms a basis for $L_2(-\infty,\infty)$.
This is discussed in the exercises.
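
The orthogonality relation (7.14.9) can also be confirmed exactly, without numerical quadrature, by using the standard Gaussian moments $\int_{-\infty}^{\infty} x^{2j} \exp(-x^2)\,dx = \sqrt{\pi}\,(2j)!/(4^j j!)$ (odd moments vanish). The sketch below works with rational multiples of $\sqrt{\pi}$; the function names are assumptions made for illustration.

```python
from fractions import Fraction
from math import factorial

def hermite_coeffs(n):
    # Coefficients [c0, c1, ...] of H_n from H_{k+1} = 2x H_k - 2k H_{k-1}.
    prev, cur = [1], [0, 2]
    if n == 0:
        return prev
    for k in range(1, n):
        nxt = [0] + [2 * c for c in cur]      # 2x H_k
        for i, c in enumerate(prev):
            nxt[i] -= 2 * k * c               # - 2k H_{k-1}
        prev, cur = cur, nxt
    return cur

def gaussian_moment(k):
    # (1/sqrt(pi)) * integral of x^k exp(-x^2) over the real line.
    if k % 2:
        return Fraction(0)
    j = k // 2
    return Fraction(factorial(2 * j), 4 ** j * factorial(j))

def weighted_inner(m, n):
    # (1/sqrt(pi)) * integral of H_m(x) H_n(x) exp(-x^2) dx, computed exactly.
    hm, hn = hermite_coeffs(m), hermite_coeffs(n)
    return sum(a * b * gaussian_moment(i + j)
               for i, a in enumerate(hm) for j, b in enumerate(hn))

# Compare with delta_{mn} 2^n n! from (7.14.9).
table_ok = all(
    weighted_inner(m, n) == (2 ** n * factorial(n) if m == n else 0)
    for m in range(6) for n in range(6)
)
print(table_ok)
```

Dividing $H_n(x)\exp(-x^2/2)$ by $(\sqrt{\pi}\,2^n n!)^{1/2}$ therefore gives functions of unit norm, which is the normalization used for the Hermite functions.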


EXERCISES 
1. Show that the coefficient of $x^n$ in $H_n(x)$ is $2^n$. [Hint: Use (7.14.5) and mathematical
induction.]
2. Show that $H_n(-x) = (-1)^nH_n(x)$.
3. Let $G(x,t) = \exp(2tx - t^2)$. Show that
$$G(x,t) = \sum_{n=0}^{\infty} \frac{1}{n!} H_n(x)t^n.$$
[$G(x,t)$ is called the generating function for the Hermite polynomials.]
4. Find the first five Hermite polynomials. [Note: $H_0(x) = 1$.]


5. Show that the Hermite functions form an orthonormal basis for $L_2(-\infty,\infty)$.
[Hint: Note that if $f \in L_2(-\infty,\infty)$, then $f(x) = f_e(x) + f_o(x)$ where $f_e$ and $f_o$ are
even and odd functions. Now use the change of independent variables $y^2 = x$
and the fact that the Laguerre functions form an orthonormal basis for $L_2[0,\infty)$.
See Exercises 8 and 12 in Section 5.18.]


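6. Let $\{\phi_n : n = 0, 1, \ldots\}$ be the Hermite functions on $L_2(-\infty,\infty)$ and let

$$\mathcal{D} = \left\{u \in L_2(-\infty,\infty) : \sum_{n=0}^{\infty} n|(u,\phi_n)|^2 < \infty\right\}.$$
Define the creation and annihilation operators $\tilde{A}$ and $A$ by

$$A\phi_n = \sqrt{n}\,\phi_{n-1}, \qquad n = 1, 2, \ldots,$$
$$A\phi_0 = 0,$$
$$\tilde{A}\phi_n = \sqrt{n+1}\,\phi_{n+1}, \qquad n = 0, 1, \ldots.$$

(a) Show that $A^* = \tilde{A}$.
(b) Show that $A = \frac{1}{\sqrt{2}}(Q + iP)$ and $\tilde{A} = \frac{1}{\sqrt{2}}(Q - iP)$ on $\mathcal{D}$.


7. Using the notation of Example 2, Section 2, show that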


fo @) ce fo) 1 

f. f_la@yP dx dy = Py, Qn—1? 
SUGGESTED REFERENCES 
Agmon [1] Jauch [1], [2] 
Bers [1] John and Schechter [1] 
Coddington and Levinson [1] Kato [1], [2] 
Courant and Hilbert [1] Lanczos [1] 
Friedman [1], [2] Meyers and Serrin [1] 
Garabedian [1] Mikhlin [1] 
S. Goldberg [1] Schwartz [1] 
Hellwig [1] Sobolev [1] 


Hille and Phillips [1] 


Also see references at end of Chapters 5 and 6. 


Appendices


Appendix A  The Hölder, Schwarz, and Minkowski Inequalities

Appendix B  Cardinality

Appendix C  Zorn's Lemma

Appendix D  Integration and Measure Theory
    1. Introduction
    2. The Riemann Integral
    3. A Problem with the Riemann Integral
    4. The Space $C_0$
    5. Null Sets
    6. Convergence Almost Everywhere
    7. The Lebesgue Integral
    8. Limit Theorems
    9. Miscellany
    10. Other Definitions of the Integral
    11. The Lebesgue Spaces $L_p$
    12. Dense Subspaces of $L_p$, $1 \le p < \infty$
    13. Differentiation
    14. The Radon-Nikodym Theorem
    15. Fubini Theorem

Appendix E  Probability Spaces and Stochastic Processes
    1. Probability Spaces
    2. Random Variables and Distribution Functions
    3. Expectation
    4. Stochastic Independence
    5. Conditional Expectation Operator
    6. Stochastic Processes

Appendix A 


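The Hölder, Schwarz,
and Minkowski Inequalities


The purpose of this appendix, as the name suggests, is to prove the Hölder,
Schwarz, and Minkowski Inequalities. We will consider these inequalities in three
settings: (1) finite sums; (2) infinite sums; and (3) integrals. Let us now state the
inequalities for each of these settings.

Schwarz and Hölder Inequality  $1 < p < \infty$ and $p^{-1} + q^{-1} = 1$.

1. FINITE SUMS:

$$\sum_{i=1}^{n} |x_iy_i| \le \left(\sum_{i=1}^{n} |x_i|^p\right)^{1/p} \left(\sum_{i=1}^{n} |y_i|^q\right)^{1/q}.$$

2. INFINITE SUMS: Given that $\sum_{i=1}^{\infty} |x_i|^p < \infty$ and $\sum_{i=1}^{\infty} |y_i|^q < \infty$, then

$$\sum_{i=1}^{\infty} |x_iy_i| \le \left(\sum_{i=1}^{\infty} |x_i|^p\right)^{1/p} \left(\sum_{i=1}^{\infty} |y_i|^q\right)^{1/q}.$$

3. INTEGRALS: Given that $\int_{\Omega} |x|^p\,dt < \infty$ and $\int_{\Omega} |y|^q\,dt < \infty$, then

$$\int_{\Omega} |xy|\,dt \le \left(\int_{\Omega} |x|^p\,dt\right)^{1/p} \left(\int_{\Omega} |y|^q\,dt\right)^{1/q}.$$

The special case $p = q = 2$ is often referred to as the Schwarz Inequality.


Minkowski Inequality  $1 \le p < \infty$.

1. FINITE SUMS:

$$\left(\sum_{i=1}^{n} |x_i \pm y_i|^p\right)^{1/p} \le \left(\sum_{i=1}^{n} |x_i|^p\right)^{1/p} + \left(\sum_{i=1}^{n} |y_i|^p\right)^{1/p}.$$

2. INFINITE SUMS: Given that $\sum_{i=1}^{\infty} |x_i|^p < \infty$ and $\sum_{i=1}^{\infty} |y_i|^p < \infty$, then

$$\left(\sum_{i=1}^{\infty} |x_i \pm y_i|^p\right)^{1/p} \le \left(\sum_{i=1}^{\infty} |x_i|^p\right)^{1/p} + \left(\sum_{i=1}^{\infty} |y_i|^p\right)^{1/p}.$$

3. INTEGRALS: Given that $\int_{\Omega} |x|^p\,dt < \infty$ and $\int_{\Omega} |y|^p\,dt < \infty$, then

$$\left(\int_{\Omega} |x \pm y|^p\,dt\right)^{1/p} \le \left(\int_{\Omega} |x|^p\,dt\right)^{1/p} + \left(\int_{\Omega} |y|^p\,dt\right)^{1/p}.$$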


The above inequalities can be rewritten in a succinct form if we introduce the
following norm notation:

1. FINITE SUMS: $\|x\|_p = \left(\sum_{i=1}^{n} |x_i|^p\right)^{1/p}$, where $x = (x_1,\ldots,x_n)$.
2. INFINITE SUMS: $\|x\|_p = \left(\sum_{i=1}^{\infty} |x_i|^p\right)^{1/p}$, where $x = (x_1,x_2,\ldots)$.
3. INTEGRALS: $\|x\|_p = \left(\int_{\Omega} |x|^p\,dt\right)^{1/p}$.

The Minkowski Inequality then becomes

$$\|x \pm y\|_p \le \|x\|_p + \|y\|_p,$$

where $x \pm y$ denotes either $(x_1 \pm y_1,\ldots,x_n \pm y_n)$, or $(x_1 \pm y_1, x_2 \pm y_2,\ldots)$, or
$x(t) \pm y(t)$ as the case may be.
In order to prove these inequalities we will use the following lemma.


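A.1 LEMMA. Let $a$ and $b$ be nonnegative real numbers. Then

$$ab \le \frac{a^p}{p} + \frac{b^q}{q}, \qquad (A.1)$$

where $1 < p < \infty$ and $p^{-1} + q^{-1} = 1$.


Proof: In the $(\xi,\eta)$-plane consider the curve $\eta = \xi^{p-1}$, or equivalently,
$\xi = \eta^{q-1}$. Let
$$A_1 = \int_0^a \xi^{p-1}\,d\xi = \frac{a^p}{p} \quad \text{and} \quad A_2 = \int_0^b \eta^{q-1}\,d\eta = \frac{b^q}{q}.$$
If we interpret $A_1$ and $A_2$ as areas, as shown in Figure A.1, then it is clear that
$ab \le A_1 + A_2$. ∎


Figure A.1.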
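
Inequality (A.1) is easy to probe numerically. The following sketch evaluates the gap $a^p/p + b^q/q - ab$ on a small grid; the grid values, the exponents tried, and the function name `young_gap` are assumptions chosen for illustration. Equality should occur exactly on the curve $b = a^{p-1}$ used in the proof.

```python
def young_gap(a, b, p):
    # Returns a^p/p + b^q/q - ab, with q the conjugate exponent (1/p + 1/q = 1).
    q = p / (p - 1.0)
    return a ** p / p + b ** q / q - a * b

grid = [0.0, 0.1, 0.5, 1.0, 2.0, 7.5]
worst = min(young_gap(a, b, p)
            for p in (1.5, 2.0, 3.0, 10.0)
            for a in grid
            for b in grid)

# On the curve b = a^(p-1) the two areas in Figure A.1 fill the rectangle exactly.
on_curve = young_gap(2.0, 2.0 ** (3.0 - 1.0), 3.0)

print(worst, on_curve)
```

Up to rounding error, `worst` is nonnegative and `on_curve` is zero.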




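The proof of the Hölder Inequality for finite sums is now easy. Since$^1$
$$\frac{|x_i|\,|y_i|}{\|x\|_p \|y\|_q} \le \frac{|x_i|^p}{p\|x\|_p^p} + \frac{|y_i|^q}{q\|y\|_q^q}, \qquad (A.2)$$
one has
$$\sum_{i=1}^{n} \frac{|x_i|\,|y_i|}{\|x\|_p \|y\|_q} \le \frac{1}{p} + \frac{1}{q} = 1.$$

Hence,
$$\sum_{i=1}^{n} |x_iy_i| = \sum_{i=1}^{n} |x_i|\,|y_i| \le \|x\|_p \|y\|_q.$$

The proof of the Hölder Inequality for infinite sums is a straightforward
extension of the result for finite sums. Indeed,

$$\sum_{i=1}^{N} |x_iy_i| \le \left(\sum_{i=1}^{N} |x_i|^p\right)^{1/p} \left(\sum_{i=1}^{N} |y_i|^q\right)^{1/q} \le \left(\sum_{i=1}^{\infty} |x_i|^p\right)^{1/p} \left(\sum_{i=1}^{\infty} |y_i|^q\right)^{1/q}.$$

Now let $N \to \infty$ on the left side and one gets the Hölder Inequality for infinite
sums.
The proof of the Hölder Inequality for integrals is similar. Here we replace
(A.2) with

$$\frac{|x(t)|\,|y(t)|}{\|x\|_p \|y\|_q} \le \frac{|x(t)|^p}{p\|x\|_p^p} + \frac{|y(t)|^q}{q\|y\|_q^q}$$
and then integrate to get the desired result.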
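
For finite sums the Hölder Inequality can be checked directly. The sketch below computes both sides for a pair of sample vectors and several exponents; the vectors, the exponents, and the function names are assumptions made for illustration.

```python
def p_norm(x, p):
    # ||x||_p = (sum |x_i|^p)^(1/p)
    return sum(abs(t) ** p for t in x) ** (1.0 / p)

def holder_sides(x, y, p):
    # Returns (sum |x_i y_i|, ||x||_p ||y||_q) with 1/p + 1/q = 1.
    q = p / (p - 1.0)
    lhs = sum(abs(a * b) for a, b in zip(x, y))
    rhs = p_norm(x, p) * p_norm(y, q)
    return lhs, rhs

x = [1.0, -2.0, 3.0, 0.5]
y = [4.0, 0.0, -1.0, 2.0]

for p in (1.5, 2.0, 3.0):
    lhs, rhs = holder_sides(x, y, p)
    print(p, lhs, rhs, lhs <= rhs)
```

With $p = 2$ this is the Schwarz Inequality; equality holds there when the vectors of absolute values are proportional, for example when `y` is replaced by `x`.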

The Minkowski Inequality follows from the Hölder Inequality. We will present
here the argument for finite sums and ask the reader to verify that the same
reasoning applies to infinite sums and integrals.

First note that

$$(|a| + |b|)^p = (|a| + |b|)^{p-1}|a| + (|a| + |b|)^{p-1}|b|.$$

Now set $a = x_i$ and $b = y_i$ and sum over $i$. Then

$$\sum_{i=1}^{n} |x_i \pm y_i|^p \le \sum_{i=1}^{n} (|x_i| + |y_i|)^p = \sum_{i=1}^{n} (|x_i| + |y_i|)^{p-1}|x_i| + \sum_{i=1}^{n} (|x_i| + |y_i|)^{p-1}|y_i|.$$

Now apply the Hölder Inequality to each of the sums on the right side of the above
equation. Since $(p - 1)q = p$, one gets

$$\sum_{i=1}^{n} (|x_i| + |y_i|)^p \le \left(\sum_{i=1}^{n} (|x_i| + |y_i|)^p\right)^{1/q} \left(\sum_{i=1}^{n} |x_i|^p\right)^{1/p} + \left(\sum_{i=1}^{n} (|x_i| + |y_i|)^p\right)^{1/q} \left(\sum_{i=1}^{n} |y_i|^p\right)^{1/p}. \qquad (A.3)$$


$^1$ We assume here that $\|x\|_p \neq 0$ and $\|y\|_q \neq 0$. If one of these happened to be zero, the Hölder
Inequality is trivially true.


If $\left(\sum_{i=1}^{n} (|x_i| + |y_i|)^p\right)^{1/q} \neq 0$, we can divide both sides of (A.3) by it, thereby
getting

$$\left(\sum_{i=1}^{n} |x_i \pm y_i|^p\right)^{1/p} \le \left(\sum_{i=1}^{n} (|x_i| + |y_i|)^p\right)^{1/p} \le \left(\sum_{i=1}^{n} |x_i|^p\right)^{1/p} + \left(\sum_{i=1}^{n} |y_i|^p\right)^{1/p},$$

which is the Minkowski Inequality. If $\left(\sum_{i=1}^{n} (|x_i| + |y_i|)^p\right)^{1/q} = 0$, then the
Minkowski Inequality is trivially true.
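
The finite-sum Minkowski Inequality can be tested in the same spirit. In the sketch below the sample vectors, the exponents, and the function names are assumptions made for illustration; the `sign` argument selects $x + y$ or $x - y$.

```python
def p_norm(x, p):
    # ||x||_p = (sum |x_i|^p)^(1/p)
    return sum(abs(t) ** p for t in x) ** (1.0 / p)

def minkowski_sides(x, y, p, sign=1):
    # Returns (||x +/- y||_p, ||x||_p + ||y||_p).
    combined = [a + sign * b for a, b in zip(x, y)]
    return p_norm(combined, p), p_norm(x, p) + p_norm(y, p)

x = [1.0, -2.0, 3.0, 0.5]
y = [4.0, 0.0, -1.0, 2.0]

for p in (1.0, 1.5, 2.0, 4.0):
    for sign in (1, -1):
        lhs, rhs = minkowski_sides(x, y, p, sign)
        print(p, sign, lhs <= rhs)
```

In other words, $\|\cdot\|_p$ satisfies the triangle inequality for every $p \ge 1$ tried here.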


There is also a Hölder Inequality for the case $p = 1$ and $q = \infty$ as well as a
Minkowski Inequality for $p = \infty$. In both cases the proofs are elementary.


Hölder Inequality $p = 1$, $q = \infty$.
1. FINITE OR INFINITE SUMS: Given $\sum_i |x_i| < \infty$ and $\sup_i |y_i| < \infty$, then
$$\sum_i |x_i y_i| \le \left(\sum_i |x_i|\right)\left(\sup_i |y_i|\right).$$
2. INTEGRALS: Given $\int |x|\,dt < \infty$ and $\operatorname{ess\,sup}_t |y(t)| < \infty$, then
$$\int |xy|\,dt \le \left(\int |x|\,dt\right)\left(\operatorname{ess\,sup}_t |y(t)|\right),$$
where ess. sup is defined in Appendix D.


Minkowski Inequality $p = \infty$.
1. FINITE OR INFINITE SEQUENCES: Given $\sup_i |x_i| < \infty$ and $\sup_i |y_i| < \infty$, then
$$\sup_i |x_i + y_i| \le \sup_i |x_i| + \sup_i |y_i|.$$
2. FUNCTIONS: Given $\operatorname{ess\,sup}_t |x(t)| < \infty$ and $\operatorname{ess\,sup}_t |y(t)| < \infty$, then
$$\operatorname{ess\,sup}_t |x(t) + y(t)| \le \operatorname{ess\,sup}_t |x(t)| + \operatorname{ess\,sup}_t |y(t)|.$$


Appendix B 
Cardinality 


Let X be a set. Then card (X), the cardinal number of X, is merely the number
of elements in X. For finite sets, this is, of course, a rather elementary idea. For
infinite sets, the concept of cardinal number is a bit more complicated.

We will not define cardinal number here.¹ Instead we shall define when one
has

$$\operatorname{card}(X) = \operatorname{card}(Y),$$
$$\operatorname{card}(X) \le \operatorname{card}(Y),$$
$$\operatorname{card}(X) < \operatorname{card}(Y).$$


These definitions will be adequate for our purposes. 


B.1 DEFINITION. Let X and Y be two sets. We shall say that
$$\operatorname{card}(X) = \operatorname{card}(Y)$$
if there is a one-to-one mapping of X onto Y. We say that
$$\operatorname{card}(X) \le \operatorname{card}(Y)$$
if there is a one-to-one mapping of X into Y. If we have $\operatorname{card}(X) \le \operatorname{card}(Y)$, then
we say that
$$\operatorname{card}(X) < \operatorname{card}(Y)$$
if every one-to-one mapping $\phi$ of X into Y is not onto, that is, the range $\phi(X)$ is a
proper subset of Y.

Before we look at some examples, there is one very important theorem which 
we should prove. 


B.2 THEOREM. (BERNSTEIN.) Let X and Y be sets. If $\operatorname{card}(X) \le \operatorname{card}(Y)$ and
$\operatorname{card}(Y) \le \operatorname{card}(X)$, then $\operatorname{card}(X) = \operatorname{card}(Y)$.


Proof: At first glance it may appear that there is nothing to prove. However,
the theorem is not that trivial. We are given one-to-one mappings (see Figure B.1)
$$f: X \to Y \quad\text{and}\quad g: Y \to X.$$
(Remember that neither mapping may be onto.) The problem is then to construct
a one-to-one mapping h of X onto Y.


¹ A definition can be found in Wilder [1, p. 99].




Figure B.1. 


Let $Y_1 = f(X)$, $X_1 = g(Y)$, and $\tilde{X}_1 = g(Y_1)$. If one had $Y_1 = Y$, or equivalently,
$\tilde{X}_1 = X_1$, then f would be a one-to-one mapping of X onto Y. On the other hand,
if we could construct a one-to-one mapping k of X onto $X_1$, then
$$h(x) = g^{-1}(k(x))$$
would be a one-to-one mapping of X onto Y. We now proceed to construct such
a mapping k. But first recall that the composition $g(f(x))$ is a one-to-one mapping.
Now define sets $X_n$ and $\tilde{X}_n$ for n = 1, 2, ..., as follows:
$$X_1 = g(Y), \qquad \tilde{X}_1 = g(f(X)),$$
$$X_2 = g(f(X_1)), \qquad \tilde{X}_2 = g(f(\tilde{X}_1)),$$
$$\vdots$$
$$X_{n+1} = g(f(X_n)), \qquad \tilde{X}_{n+1} = g(f(\tilde{X}_n)).$$
Since $\tilde{X}_1 \subset X_1$, we see that $\tilde{X}_n \subset X_n$ for all n. Furthermore, since $X_1 \subset X$ we see
that $X_2 \subset \tilde{X}_1$. Therefore, $X_{n+1} \subset \tilde{X}_n$ for all n.
Let $\tilde{X}_0 = X$ and now define a mapping $k: X \to X$ as follows:
$$k(x) = g(f(x)), \quad \text{if } x \in \tilde{X}_n - X_{n+1},\ n = 0, 1, \ldots,$$
$$k(x) = x, \quad \text{if } x \in X_n - \tilde{X}_n,\ n = 1, 2, \ldots,$$
$$k(x) = x, \quad \text{if } x \in \bigcap_{n=1}^\infty X_n.$$
It is easy to see that from the way the sets $X_n$ and $\tilde{X}_n$ were constructed, k maps
$(\tilde{X}_n - X_{n+1})$ onto $(\tilde{X}_{n+1} - X_{n+2})$ for n = 0, 1, .... Since the composition $g(f(x))$ is
one-to-one over each of these sets, we see that k is a one-to-one mapping on X.
Furthermore, it is a simple exercise to see that the range of k is
$$\bigcup_{n=1}^\infty (\tilde{X}_n - X_{n+1}) \cup \bigcup_{n=1}^\infty (X_n - \tilde{X}_n) \cup \left(\bigcap_{n=1}^\infty X_n\right) = X_1.$$
Hence $h(x) = g^{-1}(k(x))$ is a one-to-one mapping of X onto Y. ∎




EXAMPLE 1. Consider the sets
N = {1, 2, 3, ...},
Z = {0, ±1, ±2, ...},
Q = rational numbers,
Z × Z = Cartesian product of Z.


Any set X with the property that $\operatorname{card}(X) \le \operatorname{card}(N)$ is said to be countable.
When card (X) = card (N), we sometimes say that X is countably infinite. If one
has card (N) < card (X), then X is said to be uncountable or uncountably infinite.

It is easy to show that card (Z) = card (N). We ask the reader to do this.

Let us show that card (Z) = card (Z × Z). The mapping $n \to (n,1)$ defines a
one-to-one mapping of Z into Z × Z so we see that $\operatorname{card}(Z) \le \operatorname{card}(Z \times Z)$. A
one-to-one mapping of Z × Z into Z is suggested by Figure B.2. This shows that
$\operatorname{card}(Z \times Z) \le \operatorname{card}(Z)$. Hence card (Z) = card (Z × Z).


Figure B.2.


Let p/q denote a rational number, where p and q have no common divisors.
Then the mapping $p/q \to (p,q)$ defines a one-to-one mapping of Q into Z × Z so
$$\operatorname{card}(Q) \le \operatorname{card}(Z \times Z) = \operatorname{card}(Z).$$
Since $\operatorname{card}(Z) \le \operatorname{card}(Q)$ (Why?) we see that card (Z) = card (Q). ∎
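One concrete one-to-one mapping of Z × Z into Z can be sketched as follows. This is not necessarily the mapping drawn in Figure B.2 (the figure is not reproduced here); it is a different standard construction that first folds Z onto the nonnegative integers and then applies the Cantor pairing function. All function names are hypothetical.

```python
def z_to_n(z):
    """Bijection from the integers onto {0, 1, 2, ...}: 0, -1, 1, -2, 2, ... map to 0, 1, 2, 3, 4, ..."""
    return 2 * z if z >= 0 else -2 * z - 1

def cantor_pair(a, b):
    """Cantor pairing: a bijection from pairs of nonnegative integers onto {0, 1, 2, ...}."""
    return (a + b) * (a + b + 1) // 2 + b

def pair_to_z(m, n):
    """A one-to-one mapping of Z x Z into Z (its values are nonnegative integers)."""
    return cantor_pair(z_to_n(m), z_to_n(n))

# Check injectivity on a finite window of the integer lattice.
grid = [(m, n) for m in range(-20, 21) for n in range(-20, 21)]
images = {pair_to_z(m, n) for (m, n) in grid}
```

Distinct lattice points receive distinct integers, so `len(images)` equals `len(grid)` on the window tested.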


EXAMPLE 2. Let X be any nonempty set and let P(X) denote the collection
of all subsets of X. One can show that
$$\operatorname{card}(X) < \operatorname{card}(P(X)),$$
see Wilder [1, pp. 102-103]. For finite sets the last inequality becomes $n < 2^n$. ∎


EXAMPLE 3. Let J be the real interval [0,1). Let us show that
$$\operatorname{card}(N) < \operatorname{card}(J).$$
In order to do this let $\phi$ be a one-to-one mapping of N into J. We want to show that
there is a number $r \in J$ but $r \notin \phi(N)$.

For each $n \in N$, $\phi(n)$ is then a real number and we express it in terms of its
decimal expression.²
Then
$$\phi(1) = 0.d_1^1 d_2^1 d_3^1 d_4^1 \ldots,$$
$$\phi(2) = 0.d_1^2 d_2^2 d_3^2 d_4^2 \ldots,$$
$$\vdots$$
$$\phi(n) = 0.d_1^n d_2^n d_3^n d_4^n \ldots,$$
where $d_i^n$ represents one of the integers 0, 1, ..., 9. Let
$$r = 0.r_1 r_2 r_3 r_4 \ldots$$
be a real number chosen so that
$$r_1 \ne d_1^1,\quad r_2 \ne d_2^2,\quad r_3 \ne d_3^3,\ \ldots.$$
Then $r \ne \phi(1)$, $r \ne \phi(2)$, .... In other words, $r \in J$, but $r \notin \phi(N)$. ∎
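The diagonal choice of digits can be illustrated on a finite table of decimal digits. This is only a finite sketch of the idea, not a proof; the function name `diagonal_escape` and the rule "use 5 unless the diagonal digit is 5, then use 4" are assumptions made for the example. Avoiding the digits 0 and 9 keeps the constructed expansion away from the ambiguity mentioned in the footnote.

```python
def diagonal_escape(rows):
    """Given digit lists rows[n-1] = [d_1, d_2, ...], each at least len(rows) long,
    return digits r_1, ..., r_N with r_n different from the n-th digit of row n."""
    return [5 if row[i] != 5 else 4 for i, row in enumerate(rows)]

# First four digits of four hypothetical expansions phi(1), ..., phi(4).
rows = [
    [1, 4, 1, 5],
    [5, 5, 5, 5],
    [0, 0, 0, 0],
    [9, 9, 9, 5],
]
r = diagonal_escape(rows)  # digits of a number that differs from every listed row
```

Here `r` differs from row n in the n-th digit, so the number 0.r_1 r_2 r_3 r_4... cannot equal any of the listed expansions.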


EXERCISES 

1. The mapping $n \to n$ defines a one-to-one mapping of $N_e$ = {2,4,6,...} into
N = {1,2,3,...}. Explain why this does not prove that card ($N_e$) < card (N).

2. Show that card (R) = card ($R^n$), where R denotes the real line.


3. One can define a finite set as follows: A set X is finite if every one-to-one mapping
$\phi$ of X into itself is necessarily onto, that is, $\phi(X) = X$. Using this definition,
show that the union of two finite sets is finite. Also, show that N = {1,2,...}
is not finite.


4. Show that the countable union of countable sets is countable. That is, if $X_n$ is
a countable set for $n \in N$, then $X = \bigcup_{n=1}^\infty X_n$ is countable.


5. Show that card (X) = card (X × X) for any infinite set X.


6. Assume that Y is a nonfinite set and also that $\operatorname{card}(X) \le \operatorname{card}(Y)$. Show that
card (X × Y) = card (Y). (This fact motivates the equality


card (X) + card (Y) = card (Y),


which occurs in cardinal arithmetic.)


² We agree not to use a decimal expansion that terminates in an infinite sequence of 9's.


Appendix C 
Zorn’s Lemma 


The phrase "Zorn's lemma" is a misnomer. It is really an axiom that is
often used in analysis. It is actually equivalent to at least eight other similar
statements, including the Axiom of Choice, see Kelley [1, pp. 31-36].

In this appendix we shall formulate Zorn's lemma, as well as the Axiom of
Choice. We shall also give one illustration of how Zorn's lemma can be used as a
logical tool.


C.1 DEFINITION. A relation R on a set X is a subset R of the Cartesian 
product X x X. We say that x is R-related to y, or xRy, if (x,y) belongs to R. 


In this appendix we shall be concerned with "orderings." These are relations
with additional properties which we prescribe below. For this reason we shall
replace xRy with $x \le y$.


C.2 DEFINITION. A relation $\le$ on X is said to be a partial ordering if (i)
$x \le x$ for $x \in X$; (ii) if $x \le y$ and $y \le x$, then $x = y$; (iii) $x \le z$ whenever $x \le y$ and
$y \le z$. A partial ordering is said to be a total ordering on X if, in addition, (iv) for
any two points $x, y \in X$ one has either $x \le y$ or $y \le x$. A set X is said to be
partially ordered if it has a partial ordering defined on it. A subset Z of a partially
ordered set X is said to be a chain if for any two points $x, y \in Z$ one has either
$x \le y$ or $y \le x$, that is, the restriction of the partial ordering relation to Z yields a
total ordering.


C.3 DEFINITION. Let $\le$ be a partial ordering on a set X and let $A \subset X$. We
shall say that a point $x \in X$ is an upper bound for A if one has $a \le x$ for all $a \in A$.
If the set X itself has an upper bound $\bar{x}$, we shall call $\bar{x}$ a maximal element for X.


We can now state Zorn’s lemma. 


ZORN'S LEMMA. If every chain in a partially ordered set X has an upper bound,
then X has a maximal element.


Although we shall not offer a "proof," it is interesting to note that Zorn's
lemma is equivalent to the following:


AXIOM OF CHOICE. Let A be an index set. For every $\alpha \in A$, let $X_\alpha$ denote some
nonempty set. Then there is a function f defined on A such that $f(\alpha) \in X_\alpha$ for all
$\alpha \in A$.




Let us now illustrate how Zorn’s lemma can be used. 

Let A be a linearly independent set in a linear space Y. Let us show that there
is a Hamel basis $H \subset Y$ that contains A. Let X denote the collection of all linearly
independent sets B in Y with the property that $A \subset B$. We say that $B_1 \le B_2$ if
$B_1 \subset B_2$. It is easy to see that $\le$ is a partial ordering on X. Let $Z = \{B_i : i \in J\}$
denote a chain in X. (Here J denotes some index set.) Let $B = \bigcup_{i \in J} B_i$. If we can
show $B \in X$, it is clear that B is an upper bound for Z. In order to show that $B \in X$
we must show that $A \subset B$ (this is, of course, obvious) and that B is a linearly
independent set. Let $\{y_1, \ldots, y_n\}$ be any finite collection from B. Then for some
indices $\{i_1, \ldots, i_n\}$ one has $y_j \in B_{i_j}$. Since Z is a chain the sets $\{B_{i_1}, \ldots, B_{i_n}\}$ are
comparable in the sense that either $B_{i_j} \subset B_{i_k}$ or $B_{i_k} \subset B_{i_j}$. Say that the indices are
chosen so that
$$B_{i_1} \subset B_{i_2} \subset \cdots \subset B_{i_n}.$$
Then $\{y_1, \ldots, y_n\}$ belongs to $B_{i_n}$. Since $B_{i_n}$ is linearly independent, we see that the
only solution of
$$\alpha_1 y_1 + \cdots + \alpha_n y_n = 0$$
is $\alpha_1 = \alpha_2 = \cdots = \alpha_n = 0$. Hence, B is linearly independent.
We see then that Z has an upper bound, so by Zorn's lemma X has a maximal
element H. It is now easily checked that H is a Hamel basis for Y.


EXERCISES 

1. Show that a total ordering on a set is an equivalence relation (by Chapter 2). 
What about the converse ? 

2. Give an example of a partial ordering that is not a total ordering. 

3. Give an example of a relation that is not a partial ordering. 


Appendix D 


Integration
and Measure Theory 


1. INTRODUCTION


One of the most important concepts in the world of mathematics is that of 
the integral. In its most primitive form the integral was used by the early Greeks 
in their development of Euclidean geometry. For example, the problem of deter- 
mining the area of a region as simple as the inside of a circle was solved by means 
of an integration process, that is, by summing the areas of disjoint rectangles 
contained in the circle. However, it was not until after Descartes’ work in 1637 on 
analytic geometry that mathematicians could begin to view the integral as an 
object of analysis. 

Descartes’ work paved the way for the discovery of the calculus by Leibniz and 
Newton around 1665. At that time a big argument ensued as to who discovered 
the calculus first, and the mathematicians of Germany and England split off into 
warring camps each with their respective champion. It is now believed that Newton's 
work slightly preceded the work of Leibniz. However, it is the notation and the 
viewpoint of Leibniz that was adopted by the mathematical world, and his symbols
"∫" and "d" are still used today.

Leibniz developed the calculus using the concept of the "infinitesimal." At
first there was much confusion over the "nature of infinitesimals" and the question
of "adding infinitesimals." However, today this is well understood.

It was Cauchy and Riemann who first gave a systematic definition of an
integral in the first half of the nineteenth century. This integral is now named after
Riemann. Around the turn of the century, Lebesgue, in his doctoral dissertation,
gave a more general treatment of integration which has resulted in a second
revolution in the field of analysis. One of the objectives of this appendix is to develop
the theory of Lebesgue integration. In this development we shall be particularly
interested in the relationship between the Lebesgue and Riemann integrals.

The difficulty with the Riemann integral is that the space of Riemann integrable
functions is not complete, when considered as a metric space. One can view the
space of Lebesgue integrable functions as the completion of the space of Riemann
integrable functions and the Lebesgue integral as an extension of the Riemann
integral. In fact, it is possible to define the Lebesgue integral in exactly this fashion,
but we will not do that here. We will use instead the Daniell approach in developing
the Lebesgue integral. Our starting point will be based on knowledge of certain
elementary properties of the Riemann integral for continuous functions. We will
define the Lebesgue integral in terms of the Riemann integral. It will be evident
from this approach that the Lebesgue integral and the Riemann integral of a
continuous function are the same.




There are other ways of defining the Lebesgue integral which also use the 
Daniell approach. They differ from what we will do here in that they begin from a 
different starting point. These other methods are important and we shall discuss 
them in Section 10. 

It must be remembered that the Lebesgue integral is developed primarily to 
satisfy certain theoretical questions. Its importance lies in the structure as described 
by the limit theorems of Section 8. We do not pretend that it is always easy to 
compute with the Lebesgue integral. This is not our purpose. 


2. THE RIEMANN INTEGRAL 


Assume that we are given a bounded function f(t) defined on a finite closed
interval $a \le t \le b$; that is, $m \le f(t) \le M$ for all t in the interval, see Figure D.2.1.
We can assume that $M = \sup\{f(t) : a \le t \le b\}$ and $m = \inf\{f(t) : a \le t \le b\}$. A
partition P of the interval [a,b] is a finite collection of points $\{t_0, \ldots, t_n\}$ such




Figure D.2.1. 


that $a = t_0 < t_1 < \cdots < t_n = b$. Since f is bounded on [a,b] it is also bounded on
the ith subinterval $[t_{i-1}, t_i]$. Consequently the numbers
$$M_i = \sup\{f(t) : t_{i-1} \le t \le t_i\},$$
$$m_i = \inf\{f(t) : t_{i-1} \le t \le t_i\}$$
do exist, and $m \le m_i \le M_i \le M$ for all i = 1, 2, ..., n. Let $\Delta t_i = t_i - t_{i-1}$.


For each partition P we can form the upper and lower sums: 


$$U(P,f) = \sum_{i=1}^n M_i\,\Delta t_i,$$
$$L(P,f) = \sum_{i=1}^n m_i\,\Delta t_i.$$




The functions U, L satisfy the following inequalities:
$$m(b - a) \le L(P,f) \le U(P,f) \le M(b - a).$$
Thus for f fixed, U(P,f) forms a set of real numbers which is bounded above and
below. Consequently, this set has an infimum (and supremum). So we define the
upper integral of f by
$$\overline{\int_a^b} f\,dt = \inf U(P,f),$$
where the inf is taken over all possible partitions of the interval [a,b]. Similarly, we
define the lower integral by
$$\underline{\int_a^b} f\,dt = \sup L(P,f).$$
One can show that for all bounded functions f one has
$$\underline{\int_a^b} f\,dt \le \overline{\int_a^b} f\,dt. \qquad (D.2.1)$$


To prove (D.2.1), one must use the fact that if P and Q are two partitions of [a,b], 
then 


L(P,f) < U(Q,f). (D.2.2) 


However, (D.2.2) can be established by using the notion of a refinement of a
partition. This argument is outlined in Exercises 1-2.

The inequality (D.2.2) shows that for each partition P, L(P,f) is a lower
bound for the family {U(Q,f)}, where Q is any partition of [a,b]. Consequently,
it is smaller than the greatest lower bound, or
$$L(P,f) \le \overline{\int_a^b} f\,dt. \qquad (D.2.3)$$
Finally, (D.2.3) shows that the upper integral is an upper bound for the family
L(P,f), where P is any partition of [a,b]. Since $\underline{\int_a^b} f\,dt$ is the least upper bound for
the family it follows that (D.2.1) is true.

If it happens that the equality holds in (D.2.1), then we say that f is Riemann
integrable (R-integrable) and the integral of f is defined to be the common value
$$\int_a^b f\,dt = \underline{\int_a^b} f\,dt = \overline{\int_a^b} f\,dt.$$
We shall let $\mathcal{R} = \mathcal{R}([a,b])$ denote the class of R-integrable functions on the interval
[a,b].

Summarizing, then, we see that every bounded function has an upper and lower
integral. A certain subclass $\mathcal{R}$ (where the two integrals are the same) is the class
of Riemann-integrable functions.

The first question that naturally arises is to determine which functions lie in
$\mathcal{R}$. A partial answer to this question is given in the following existence theorem,
which is usually proven in an elementary calculus course. (See Exercise 5.)
A complete answer is stated in Exercise 2, Section D.9.


D.2.1 THEOREM. Let f(t) be continuous for $a \le t \le b$. Then $f \in \mathcal{R}$.


Let us summarize some of the elementary properties of the Riemann integral. 


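D.2.2 THEOREM.
(a) If $f \in \mathcal{R}([a,b])$ and $a < c < b$, then $f \in \mathcal{R}([a,c])$, $f \in \mathcal{R}([c,b])$, and
$$\int_a^b f\,dt = \int_a^c f\,dt + \int_c^b f\,dt.$$
(b) If $f_1, f_2 \in \mathcal{R}$ and $\alpha_1, \alpha_2 \in R$, then $f = \alpha_1 f_1 + \alpha_2 f_2 \in \mathcal{R}$ and
$$\int_a^b f\,dt = \alpha_1 \int_a^b f_1\,dt + \alpha_2 \int_a^b f_2\,dt.$$
(c) If $f \in \mathcal{R}$ and $f(t) \ge 0$ on [a,b], then $\int_a^b f\,dt \ge 0$.
(d) If $f \in \mathcal{R}$ and $m \le f(t) \le M$ on [a,b], then
$$m(b - a) \le \int_a^b f\,dt \le M(b - a).$$
(e) If f is continuous on [a,b], then there is a $\xi \in [a,b]$ such that
$$\int_a^b f\,dt = f(\xi)(b - a).$$
(f) If $f \in \mathcal{R}$, then $|f| \in \mathcal{R}$, where $|f|(t) = |f(t)|$, and
$$\left|\int_a^b f\,dt\right| \le \int_a^b |f|\,dt.$$
(g) If $f, g \in \mathcal{R}$ and $f(t) \le g(t)$, then $\int_a^b f\,dt \le \int_a^b g\,dt$.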


Since these facts are proven in most books on elementary calculus, we shall not 
discuss them here. 

Before proceeding further, we should note that there are (bounded) functions
which are not Riemann integrable. Indeed,
$$f(t) = \begin{cases} 1, & \text{if } t \text{ is rational}, \\ 0, & \text{if } t \text{ is irrational}, \end{cases} \qquad (D.2.5)$$
is one such function. For if P is any partition of the interval [a,b], then $M_i = 1$
and $m_i = 0$ for all i. Consequently, $U(P,f) = 1(b - a)$ and $L(P,f) = 0(b - a)$.
Thus $\overline{\int_a^b} f\,dt = (b - a)$ and $\underline{\int_a^b} f\,dt = 0$. We do show later that this particular
function is Lebesgue integrable.

While the above definition of the Riemann integral is satisfactory from a
theoretical point of view, it is not practical for computation. We need some
additional information for this purpose.




Let $P: a = t_0 < t_1 < \cdots < t_n = b$ be a partition of [a,b]. Define the norm of
P by
$$|P| = \max_{1 \le i \le n} \Delta t_i = \max_{1 \le i \le n} |t_i - t_{i-1}|.$$


D.2.3 THEOREM. Let f be an R-integrable function on [a,b]. Let $\{P_m\}$ be a
sequence of partitions such that $|P_m| \to 0$ as $m \to \infty$. Then
$$L(P_m,f) \to \int_a^b f\,dt$$
and
$$U(P_m,f) \to \int_a^b f\,dt$$
as $m \to \infty$.


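In applying this theorem, one quite often chooses $P_{m+1}$ to be a refinement of
$P_m$, that is, the set $P_{m+1}$ contains the set $P_m$. In this case, the sequences $\{L(P_m,f)\}$
and $\{U(P_m,f)\}$ are monotone. In fact, if $P_m$ is the partition
$$P_m: a = t_0 < t_1 < \cdots < t_n = b$$
and the functions $U_m$ and $L_m$ are defined by
$$U_m(t) = \begin{cases} f(a), & t = t_0, \\ M_i, & t_{i-1} < t \le t_i,\ i = 1, 2, \ldots, n, \end{cases}$$
$$L_m(t) = \begin{cases} f(a), & t = t_0, \\ m_i, & t_{i-1} < t \le t_i,\ i = 1, 2, \ldots, n, \end{cases}$$
then one can easily show that
$$L_1(t) \le L_2(t) \le \cdots \le f(t) \le \cdots \le U_2(t) \le U_1(t).$$
Therefore,
$$L(P_1,f) \le L(P_2,f) \le \cdots \le \int_a^b f\,dt \le \cdots \le U(P_2,f) \le U(P_1,f).$$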


A special case of this occurs when the partition points are equally spaced. 


D.2.4 COROLLARY. Let f be a Riemann-integrable function on [a,b]. Further,
let $P_m$ be the partition formed by decomposing [a,b] into m equal parts, that is,
$P_m = \{t_0, t_1, \ldots, t_m\}$, where $t_i = a + i(b - a)/m$. Then
$$L(P_m,f) \to \int_a^b f\,dt,$$
$$U(P_m,f) \to \int_a^b f\,dt.$$
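The corollary can be illustrated with exact arithmetic for an increasing function, where $M_i = f(t_i)$ and $m_i = f(t_{i-1})$ on each subinterval. The sketch below is not from the text; the function name `upper_lower_sums` is hypothetical, and it assumes that f is increasing and that a, b, m are integers so that `Fraction` arithmetic is exact.

```python
from fractions import Fraction

def upper_lower_sums(f, a, b, m):
    """Upper and lower sums of an increasing f on [a, b] over m equal parts.
    For increasing f the sup on [t_{i-1}, t_i] is f(t_i) and the inf is f(t_{i-1})."""
    dt = Fraction(b - a, m)
    pts = [a + i * dt for i in range(m + 1)]
    upper = sum(f(pts[i]) * dt for i in range(1, m + 1))
    lower = sum(f(pts[i - 1]) * dt for i in range(1, m + 1))
    return upper, lower

b = 3
for m in (1, 10, 100, 1000):
    U, L = upper_lower_sums(lambda t: t, 0, b, m)
    print(m, float(L), float(U))  # both columns approach b**2 / 2 = 4.5
```

For f(t) = t on [0, b] the upper sum equals $b^2(m+1)/(2m)$ and the lower sum equals $b^2(m-1)/(2m)$, so their difference $b^2/m$ shrinks to zero as m grows.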




EXERCISES 


1. Let P and Q be partitions of the interval $a \le t \le b$. We say that $P \subset Q$ if every
point of P is a point of Q. Show that for every bounded function f one has
$$L(P,f) \le L(Q,f) \le U(Q,f) \le U(P,f) \quad \text{when } P \subset Q.$$

2. Use the result of Exercise 1 to prove (D.2.2). [Hint: First show that if P and Q
are two partitions, then there is a partition $\tilde{P}$ that satisfies $P \subset \tilde{P}$, $Q \subset \tilde{P}$.]

3. Use Corollary D.2.4 and mathematical induction to show that
$$\int_0^b t\,dt = \tfrac{1}{2} b^2.$$
[That is, if $P_m$ is defined as in Corollary D.2.4 and f(t) = t, show that
$$U(P_m,f) = \frac{b^2}{m^2} \sum_{i=1}^m i = \frac{b^2}{2}\,\frac{m+1}{m} \to \frac{b^2}{2}$$
as $m \to \infty$.]

4. One can also show that any bounded monotone function has a Riemann
integral. Let f be a bounded increasing function on [a,b]. If $P = \{t_0, t_1, \ldots, t_n\}$
is any partition and $M_i$ and $m_i$ are defined as above, then $M_i = f(t_i)$ and
$m_i = f(t_{i-1})$. [If the points of P are equally spaced, then $U(P,f) - L(P,f)
= [f(b) - f(a)]\,\Delta t$, where $\Delta t = t_i - t_{i-1} = (b - a)/n$.] Show that there is a sequence
of partitions $\{P_m\}$ such that $U(P_m,f) - L(P_m,f) \to 0$. Use this to show that f
is R-integrable.

5. The following steps will lead to a proof of Theorem D.2.1: For $a \le t \le b$
define
$$G(t) = \overline{\int_a^t} f\,dt, \qquad H(t) = \underline{\int_a^t} f\,dt,$$
where f is continuous.
(a) Show that for $a \le t < t + h \le b$ one has
$$G(t + h) - G(t) = \overline{\int_t^{t+h}} f\,dt = f(t^*) \cdot h$$
for some $t^*$ with $t \le t^* \le t + h$, and
$$H(t + h) - H(t) = \underline{\int_t^{t+h}} f\,dt = f(t^{**}) \cdot h$$
for some $t^{**}$ with $t \le t^{**} \le t + h$.
(b) Show that
$$\frac{dG}{dt} = \frac{dH}{dt} = f.$$
(c) Show that G(t) = H(t) for all t, $a \le t \le b$. That is,
$$\overline{\int_a^b} f\,dt = \underline{\int_a^b} f\,dt.$$




3. A PROBLEM WITH THE RIEMANN INTEGRAL 


The Riemann integral does have at least one serious shortcoming. That is,
it is possible for a sequence of functions $\{f_n\} \subset \mathcal{R}$ to converge to a function f,
which may even be bounded, but which is not R-integrable. For example, let
$$f_n(t) = \lim_{m \to \infty} [\cos(n!\,\pi t)]^{2m} = \begin{cases} 1, & \text{if } t = k/n!,\ k \text{ an integer}, \\ 0, & \text{otherwise} \end{cases}$$
for n = 1, 2, .... Then $f_n(t) \to f(t)$, where
$$f(t) = \begin{cases} 1, & \text{if } t \text{ is rational}, \\ 0, & \text{if } t \text{ is irrational}, \end{cases}$$
which is the function given in (D.2.5). As shown above, f(t) is not in $\mathcal{R}$. However,
$f_n \in \mathcal{R}$ and $\int_a^b f_n(t)\,dt = 0$ for all n.
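The pointwise convergence of $f_n$ at a rational point can be checked with exact arithmetic: a rational t = p/q (in lowest terms) is of the form k/n! exactly when q divides n!, which holds for every n ≥ q. The following is a small sketch, with the hypothetical helper `f_n` operating on `Fraction` inputs only (irrational points cannot be represented this way).

```python
from fractions import Fraction
from math import factorial

def f_n(t, n):
    """Return 1 if t = k/n! for some integer k, else 0; t must be an exact Fraction."""
    return 1 if (t * factorial(n)).denominator == 1 else 0

t = Fraction(3, 7)
# f_n(t) is 0 until n! becomes divisible by 7, and 1 from then on.
values = [f_n(t, n) for n in range(1, 10)]
```

For t = 3/7 the values switch from 0 to 1 at n = 7 and stay there, in agreement with $f_n(t) \to 1$ at rational t.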
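Although $f \notin \mathcal{R}$, it would seem natural to define $\int_a^b f(t)\,dt$ by the relationship:
$$\lim_{n \to \infty} \int_a^b f_n(t)\,dt = \int_a^b \left\{\lim_{n \to \infty} f_n(t)\right\} dt = \int_a^b f(t)\,dt. \qquad (D.3.1)$$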


In other words, it seems natural to enlarge the class of integrable functions by 
defining the integral of f, the limit of a sequence f, , by means of (D.3.1). It is shown 
below that this leads to a consistent notion for the integral, and the new class of 
integrable functions are the Lebesgue integrable functions. 

Actually we shall not use sequences as the vehicle for defining the Lebesgue 
integral, but instead we shall use infinite series. Since sequences and series are 
correlative concepts, it really does not matter which concept we use to extend the 
integral. We choose to use series only because certain technical conditions are 
easier to formulate in this context. 


4. THE SPACE $C_0$


We begin our construction of the Lebesgue integral by isolating the properties
of the Riemann integral that are essential for our theory. First we let $C_0 = C_0(R,R)$
denote the class of all continuous real-valued functions defined on R with compact
support. This means that $\phi \in C_0$ if and only if $\phi: R \to R$ is continuous and there is a
bounded interval [a,b], which may depend on $\phi$, such that $\phi(t) = 0$ for $t \notin [a,b]$.
The graph of a typical function in $C_0$ appears in Figure D.4.1. It follows from
Theorem D.2.1 that for any continuous function $\phi$, the integral $\int_a^b \phi\,dt$ is defined
and therefore we define $\int \phi\,dt$ by
$$\int \phi\,dt = \int_{-\infty}^{\infty} \phi\,dt = \int_a^b \phi\,dt, \qquad (D.4.1)$$
where $\phi$ vanishes outside [a,b]. The integral $\int \phi\,dt$ is then defined for every function
in $C_0$.
The space $C_0$ has an important property.




Figure D.4.1. A Function in $C_0$.


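(I.1) If $\phi$ and $\psi$ are two functions in $C_0$, and $\alpha$ and $\beta$ are two real numbers,
then $\phi \vee \psi$, $\phi \wedge \psi$, and $\alpha\phi + \beta\psi$ are in $C_0$, where
$$(\phi \vee \psi)(t) = \max[\phi(t), \psi(t)],$$
$$(\phi \wedge \psi)(t) = \min[\phi(t), \psi(t)],$$
$$(\alpha\phi + \beta\psi)(t) = \alpha\phi(t) + \beta\psi(t).$$
In addition to property (I.1), the integral $\int \phi\,dt$ satisfies the following
properties:


(I.2) If $\phi$ and $\psi$ are two functions in $C_0$ and $\alpha$ and $\beta$ are two real numbers,
then
$$\int (\alpha\phi + \beta\psi)\,dt = \alpha \int \phi\,dt + \beta \int \psi\,dt.$$
(I.3) If $\phi$ is a function in $C_0$ with $\phi \ge 0$, then $\int \phi\,dt \ge 0$.


(I.4) (Dini's Theorem) If $\{\phi_n\}$ is a decreasing sequence of nonnegative
functions in $C_0$ (that is, $0 \le \phi_{n+1} \le \phi_n$) and $\lim_{n \to \infty} \phi_n(t) = 0$ for all t, then
$$\lim_{n \to \infty} \int \phi_n\,dt = 0.$$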


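We shall give a proof of Dini's Theorem shortly. Before we do this though, let
us examine some of the consequences of Properties (I.1), (I.2), and (I.3) which
will be used below. Let $\phi \in C_0$. It follows that
$$\phi^+ = \phi \vee 0, \qquad \phi^- = (-\phi) \vee 0 = -(\phi \wedge 0), \qquad |\phi| = \phi^+ + \phi^-$$
are also in $C_0$. Furthermore,
$$\left|\int \phi\,dt\right| = \left|\int \phi^+\,dt - \int \phi^-\,dt\right| \le \int \phi^+\,dt + \int \phi^-\,dt = \int |\phi|\,dt.$$
Also, if $\phi$, $\psi$ are in $C_0$ and $\phi \le \psi$, then $0 \le \psi - \phi$ so that
$$0 \le \int (\psi - \phi)\,dt = \int \psi\,dt - \int \phi\,dt, \quad \text{that is,} \quad \int \phi\,dt \le \int \psi\,dt.$$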




Let us now prove Dini's Theorem. Since $\phi_1 \in C_0$, there is a bounded interval
[a,b] with the property that $\phi_1$ vanishes outside [a,b]. Since $0 \le \phi_n \le \phi_1$ for all n
we see that $\phi_n$ vanishes outside [a,b]. Hence
$$\int \phi_n\,dt = \int_a^b \phi_n\,dt$$
for all n. Since $\int \phi_n\,dt \ge 0$, we want to show that for every $\epsilon > 0$, there is an N such
that
$$\int \phi_n\,dt \le \epsilon(b - a)$$
for all $n \ge N$. This will be accomplished once we show that for every $\epsilon > 0$, there is
an N (independent of t) such that
$$\phi_n(t) \le \epsilon, \qquad a \le t \le b, \quad n \ge N. \qquad (D.4.2)$$
If (D.4.2) were not true, then this would mean that we can find an $\epsilon_0 > 0$ such that
for every N one can find an $n \ge N$ and a $t_n$ in [a,b] such that
$$\epsilon_0 < \phi_n(t_n). \qquad (D.4.3)$$
Since the interval [a,b] is (sequentially) compact (Theorem 3.17.14) we can find a
convergent subsequence of $\{t_n\}$, call it $\{t_{n'}\}$ and let $t_0 = \lim t_{n'}$. Since $\phi_{n'}(t_0) \to 0$
as $n' \to \infty$ we can find an M such that $\phi_M(t_0) < \epsilon_0/4$. It follows from the continuity
of $\phi_M$ that there is a $\delta > 0$ such that $\phi_M(t) < \epsilon_0/2$ for all t satisfying $|t - t_0| < \delta$.
Since the sequence $\{\phi_n\}$ is decreasing, this implies that
$$\phi_n(t) \le \epsilon_0/2, \qquad |t - t_0| < \delta, \quad n \ge M. \qquad (D.4.4)$$
By combining (D.4.3) and (D.4.4) we see that $|t_{n'} - t_0| \ge \delta$ for $n' \ge M$, which contradicts the
fact that $t_0 = \lim t_{n'}$; therefore, (D.4.2) is true. ∎
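Dini's Theorem can be illustrated numerically on one particular decreasing sequence. The example below is an assumption made for illustration, not taken from the text: with the tent function $h(t) = \max(0, 1 - |t|)$, the functions $\phi_n = h(1 - h)^n$ lie in $C_0$, decrease in n, and tend to 0 at every point, and their integrals equal $2/((n+1)(n+2))$. The midpoint rule stands in for the Riemann integral.

```python
def tent(t):
    """Tent function h(t) = max(0, 1 - |t|), continuous with support [-1, 1]."""
    return max(0.0, 1.0 - abs(t))

def phi(n, t):
    """phi_n = h * (1 - h)**n: nonnegative, decreasing in n, pointwise limit 0."""
    h = tent(t)
    return h * (1.0 - h) ** n

def midpoint_integral(g, a, b, steps=20000):
    """Midpoint-rule approximation of the integral of g over [a, b]."""
    dt = (b - a) / steps
    return sum(g(a + (k + 0.5) * dt) for k in range(steps)) * dt

# Integrals of phi_n over its support for a few values of n.
integrals = [midpoint_integral(lambda t, n=n: phi(n, t), -1.0, 1.0)
             for n in (1, 2, 4, 8, 16)]
```

The computed integrals decrease toward zero, starting near 1/3 for n = 1, as property (I.4) predicts.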


We ask the reader to note that the theory of the Lebesgue integral, which we
now turn to, depends only on properties (I.1), (I.2), (I.3), and (I.4). This is
important because if one replaces $C_0$ with another class of functions, and if one
replaces the Riemann integral $\int \phi\,dt$ with another integral, and if one does it in
such a way that properties (I.1), (I.2), (I.3), and (I.4) still hold, then one can mimic
the theory we now present and extend the integral to a larger class of functions.
This approach is what is called the Daniell approach. We will return to some
variations on this theme in Section 10.


5. NULL SETS 


The first step in the construction of the Lebesgue integral is to introduce a
weaker form of convergence, namely "convergence almost everywhere." This will
be defined precisely in the next section, but it means roughly that a sequence of
functions converges everywhere except for a negligible set, or null set. In this
section we wish to study the concept of a "null set."




D.5.1 DEFINITION. A set $E \subset R$ is said to be a null set if there is an increasing
sequence $\{\phi_n\}$ in $C_0$ such that $\{\phi_n(t)\}$ diverges for each t in E while
$$\lim_{n \to \infty} \int \phi_n\,dt < +\infty.$$


The following theorem gives a characterization of null sets in terms of infinite
series.


D.5.2 THEOREM. The following statements are equivalent:

(a) E is a null set.

(b) There is a sequence $\{\psi_n\}$ in $C_0$ such that $\sum_n \psi_n(t)$ diverges for each t in E
while $\sum_n \int |\psi_n|\,dt < \infty$.

(c) There is a sequence $\{\psi_n\}$ in $C_0$ such that $\sum_n |\psi_n(t)|$ diverges for each t in
E while $\sum_n \int |\psi_n|\,dt < \infty$.


Proof: (a) ⇒ (c). Assume that E is a null set and let $\{\phi_n\}$ be a sequence
satisfying the conditions of Definition D.5.1. Let $\psi_n = \phi_{n+1} - \phi_n$. Then $\psi_n \ge 0$
and $\sum_{n=1}^m \psi_n = \phi_{m+1} - \phi_1$. Hence $\sum_n \psi_n(t)$ diverges for $t \in E$. Also,
$$\sum_{n=1}^m \int |\psi_n|\,dt = \sum_{n=1}^m \int (\phi_{n+1} - \phi_n)\,dt = \int \phi_{m+1}\,dt - \int \phi_1\,dt.$$
Hence,
$$\sum_{n=1}^\infty \int |\psi_n|\,dt = \lim_{n \to \infty} \int \phi_n\,dt - \int \phi_1\,dt < \infty.$$

(c) ⇒ (b). Obvious.

(b) ⇒ (a). Let $\{\psi_n\}$ be a sequence in $C_0$, where $\sum_n \psi_n(t)$ diverges for each
t in E and $\sum_n \int |\psi_n|\,dt < \infty$. Now let $\phi_m = \sum_{n=1}^m |\psi_n|$. It is easy to check that $\{\phi_m\}$
satisfies the conditions of Definition D.5.1, hence E is a null set. ∎


EXAMPLE 1. Let E consist of a single point {p}. Then E is a null set. To see
this we simply let $\psi_n$ be a function in $C_0$ that satisfies $\psi_n(p) = 1$ and $\int |\psi_n|\,dt = 1/n^2$,
as shown in Figure D.5.1. Since $\sum_n \psi_n(p) = +\infty$ and $\sum_n \int |\psi_n|\,dt < \infty$, we see that
{p} is a null set, by statement (b) in the last theorem. ∎
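One concrete choice of $\psi_n$ is a tent of height 1 centred at p with half-width $1/n^2$, whose area is exactly $1/n^2$. Whether this matches the shape drawn in Figure D.5.1 is an assumption; the names `psi` and `psi_integral` are hypothetical.

```python
from fractions import Fraction

def psi(n, p, t):
    """Tent function of height 1 centred at p, vanishing outside [p - 1/n**2, p + 1/n**2]."""
    return max(Fraction(0), 1 - n * n * abs(t - p))

def psi_integral(n):
    """Exact integral of psi_n: the area of a triangle with base 2/n**2 and height 1."""
    return Fraction(1, n * n)

p = Fraction(1, 3)
partial_values = sum(psi(n, p, p) for n in range(1, 101))        # equals the number of terms
partial_integrals = sum(psi_integral(n) for n in range(1, 101))  # bounded partial sums
```

The partial sums of $\psi_n(p)$ grow without bound, while the partial sums of the integrals stay below $\pi^2/6$, which is what statement (b) of Theorem D.5.2 requires.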




Figure D.5.1. 




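EXAMPLE 2. Let $E_1, E_2, \ldots$ be a sequence of null sets. We claim that the
union $E = \bigcup_{k=1}^\infty E_k$ is a null set. In order to see this we let $\{\phi_{nk}\}$ be a sequence in
$C_0$ that satisfies
$$\sum_n |\phi_{nk}(t)| = +\infty, \quad \text{for all } t \text{ in } E_k,$$
$$\sum_n \int |\phi_{nk}|\,dt = M_k < \infty.$$
Now let $\psi_{nk} = 2^{-k} M_k^{-1} \phi_{nk}$, which is a sequence in $C_0$. If $t \in E$, then $t \in E_k$ for
some k and
$$\sum_{n,k} |\psi_{nk}(t)| \ge 2^{-k} M_k^{-1} \sum_n |\phi_{nk}(t)| = +\infty.$$
Furthermore,
$$\sum_{n,k} \int |\psi_{nk}|\,dt = \sum_k 2^{-k} M_k^{-1} \sum_n \int |\phi_{nk}|\,dt = \sum_k 2^{-k} < \infty.$$
It follows that E is a null set. ∎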


By combining Examples 1 and 2 we see that any countable set is a null set. In 
particular, the set of rational numbers is a null set. In Section 9, we shall discuss 
the Cantor set. This is an example of an uncountable set which is a null set. 


EXAMPLE 3. Let us now give an example of a set that is not a null set.
Let E be any nontrivial interval with end points a < b. Now choose $\alpha$, $\beta$ so that
$a < \alpha < \beta < b$. Now let $\xi$ be a nonnegative function in $C_0$ that satisfies
$$\xi(t) = \begin{cases} N, & t \text{ in } [\alpha,\beta], \\ 0, & \text{outside } [a,b], \end{cases}$$
as shown in Figure D.5.2. If E were a null set, then we could find an increasing
sequence $\{\phi_n\}$ in $C_0$ such that $\phi_n(t) \to +\infty$ for $a < t < b$, and $\lim_{n \to \infty} \int \phi_n\,dt < \infty$.


Figure D.5.2.


Now set
$$\psi_n = (\xi - \phi_n)^+.$$
It is easy to see that $\{\psi_n\}$ is a decreasing sequence and that $\lim_{n \to \infty} \psi_n(t) = 0$ for
all t. Hence by property (I.4) we see that $\lim \int \psi_n\,dt = 0$. By using the fact
that $(\xi - \phi_n) \le (\xi - \phi_n)^+ = \psi_n$, we get
$$N(\beta - \alpha) - \int_\alpha^\beta \phi_1\,dt = \int_\alpha^\beta (\xi - \phi_1)\,dt = \int_\alpha^\beta [(\xi - \phi_n) + (\phi_n - \phi_1)]\,dt$$
$$\le \int_\alpha^\beta (\psi_n + \phi_n - \phi_1)\,dt$$
$$\le \int (\psi_n + \phi_n - \phi_1)\,dt,$$
since $\phi_n - \phi_1 \ge 0$ and $\psi_n \ge 0$. Now let $n \to \infty$ and we get
$$N(\beta - \alpha) \le \lim \int \phi_n\,dt.$$
Since N is arbitrary, this implies that $\lim_{n \to \infty} \int \phi_n\,dt = +\infty$, which is a
contradiction. Hence E is not a null set. ∎


EXERCISE 


1. Let $I_k$, k = 1, 2, ..., be a sequence of intervals with end points $a_k \le b_k$. Define
the total length of the sequence $\{I_k\}$ by $\sum_{k=1}^\infty (b_k - a_k)$. A set $E \subset R$ is said to be
a set of measure zero if for every $\epsilon > 0$ there is a sequence $\{I_k\}$ of intervals with
total length $\le \epsilon$ and such that $E \subset \bigcup_k I_k$. Show that a set $E \subset R$ is a null set if
and only if it is a set of measure zero.


6. CONVERGENCE ALMOST EVERYWHERE 


We will see soon that from the point of view of integration theory, null sets 
can be ignored, that is, they are negligible. In order to make this precise, we need the 
following definitions. 


D.6.1 DEFINITION. If some property holds for all real numbers t outside
some null set, then we shall say that this property holds almost everywhere (which
is usually abbreviated "a.e."). For example, a sequence of functions $\{f_n\}$ is said to
converge to f almost everywhere, that is,
$$\lim f_n = f \quad (a.e.),$$
if the set of real numbers t, for which $\{f_n(t)\}$ fails to converge to f(t), is a null set.
Similarly we say that
$$\sum_n f_n(t) = F(t) \quad (a.e.),$$
if the set of real numbers t, for which $\sum_n f_n(t) \ne F(t)$, is a null set. Also $f \le g$ (a.e.)
means that the set $\{t : f(t) > g(t)\}$ is a null set.

The first consequence of this concept is that we can prove an "almost everywhere"
form of Dini's Theorem.




D.6.2 THEOREM. Let $\{\phi_n\}$ be a decreasing sequence of nonnegative functions in
$C_0$ such that
$$\lim_{n \to \infty} \phi_n = 0 \quad (a.e.).$$
Then $\lim_{n \to \infty} \int \phi_n\,dt = 0$.


Proof: Let $\epsilon > 0$ be given. We will now show that $\lim_{n \to \infty} \int \phi_n\,dt \le \epsilon$.
First we note that since $\{\phi_n\}$ is decreasing and $\phi_n \ge 0$, the limit $\lim_{n \to \infty} \phi_n(t)$
exists for all t. Next let E be the null set
$$E = \left\{t : \lim_{n \to \infty} \phi_n(t) \ne 0\right\}.$$
Now use the definition of a null set to construct an increasing sequence $\{\psi_n\}$ so
that $\psi_n(t) \to +\infty$ for $t \in E$ and
$$\int \psi_n\,dt \le K < \infty$$
for all n, where K is an appropriate constant.

Now set $\xi_n = \phi_n - \epsilon K^{-1} \psi_n$. It is clear that $\{\xi_n\}$ is a decreasing sequence and
therefore $\{\xi_n^+\}$ is a decreasing sequence. Furthermore $\lim_{n \to \infty} \xi_n^+(t) = 0$ for
all t. Therefore, by property (I.4), we get
$$\lim \int \xi_n^+\,dt = 0.$$
Since
$$\phi_n = \xi_n + \epsilon K^{-1} \psi_n \le \xi_n^+ + \epsilon K^{-1} \psi_n,$$
we get
$$\lim \int \phi_n\,dt \le \lim \int \xi_n^+\,dt + \lim \epsilon K^{-1} \int \psi_n\,dt \le \epsilon.$$
Now let $\epsilon \to 0$ and we get $\lim_{n \to \infty} \int \phi_n\,dt = 0$. ∎


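D.6.3 COROLLARY. Let $\psi \in C_0$ and let $\{\phi_n\}$ be a sequence of nonnegative
functions in $C_0$. Assume that
$$\sum_n \phi_n \ge \psi \quad (a.e.).$$
Then $\sum_n \int \phi_n\,dt \ge \int \psi\,dt$.


Proof: Let $\xi_m = \left(\psi - \sum_{n=1}^m \phi_n\right)$. Then $\{\xi_m\}$ and $\{\xi_m^+\}$ are decreasing sequences.
Furthermore,
$$\lim_{m \to \infty} \xi_m^+ = 0 \quad (a.e.)$$
which implies that $\lim_{m \to \infty} \int \xi_m^+\,dt = 0$. Since
$$\psi = \xi_m + \sum_{n=1}^m \phi_n \le \xi_m^+ + \sum_{n=1}^m \phi_n$$
we get
$$\int \psi\,dt \le \int \xi_m^+\,dt + \sum_{n=1}^m \int \phi_n\,dt.$$
By letting $m \to \infty$ we get the desired conclusion. ∎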


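D.6.4 COROLLARY. Let $\{\phi_n\}$ be a sequence in $C_0$ such that $\sum_n \phi_n = 0$ (a.e.) and $\sum_{n=1}^{\infty} \int |\phi_n|\,dt < \infty$. Then

$$\sum_{n=1}^{\infty} \int \phi_n\,dt = 0.$$

Proof: It follows from Theorem D.5.2 that the set

$$E = \Big\{t : \sum_n |\phi_n(t)| = +\infty\Big\}$$

is a null set. Since $\phi_n^+ \le |\phi_n|$ and $\phi_n^- \le |\phi_n|$, the series $\sum_n \phi_n^+(t)$ and $\sum_n \phi_n^-(t)$ converge absolutely for $t \notin E$. Hence, for $t \notin E$ we have

$$\sum_n \phi_n^+(t) - \sum_n \phi_n^-(t) = \sum_n [\phi_n^+(t) - \phi_n^-(t)] = \sum_n \phi_n(t) = 0,$$

that is, $\sum_n \phi_n^+(t) = \sum_n \phi_n^-(t)$ for $t \notin E$. Now let $m$ be fixed and set $\psi = \sum_{n=1}^{m} \phi_n^+$. One then has

$$\psi \le \sum_{n=1}^{\infty} \phi_n^- \quad \text{(a.e.)},$$

and by the last corollary we get

$$\int \psi\,dt = \sum_{n=1}^{m} \int \phi_n^+\,dt \le \sum_{n=1}^{\infty} \int \phi_n^-\,dt.$$

Now let $m \to \infty$ and we get

$$\sum_{n=1}^{\infty} \int \phi_n^+\,dt \le \sum_{n=1}^{\infty} \int \phi_n^-\,dt. \tag{D.6.1}$$

By interchanging the role of $\phi_n^+$ and $\phi_n^-$ we can prove that the inequality in (D.6.1) can be reversed. We thus have

$$0 = \sum_{n=1}^{\infty} \int \phi_n^+\,dt - \sum_{n=1}^{\infty} \int \phi_n^-\,dt = \sum_{n=1}^{\infty} \Big( \int \phi_n^+\,dt - \int \phi_n^-\,dt \Big) = \sum_{n=1}^{\infty} \int (\phi_n^+ - \phi_n^-)\,dt = \sum_{n=1}^{\infty} \int \phi_n\,dt. \ ∎$$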




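D.6.5 COROLLARY. Let $\{\phi_n\}$ and $\{\psi_n\}$ be two sequences from $C_0$ satisfying the following:

$$\sum_n \phi_n = \sum_n \psi_n \quad \text{(a.e.)},$$

$$\sum_n \int |\phi_n|\,dt < \infty, \qquad \sum_n \int |\psi_n|\,dt < \infty.$$

Then

$$\sum_n \int \phi_n\,dt = \sum_n \int \psi_n\,dt.$$

Proof: Let $\xi_n = \phi_n - \psi_n$. It is a simple matter to show that $\{\xi_n\}$ satisfies the hypotheses of the last corollary. One then has

$$0 = \sum_n \int \xi_n\,dt = \sum_n \int \phi_n\,dt - \sum_n \int \psi_n\,dt. \ ∎$$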


7. THE LEBESGUE INTEGRAL 


Let us summarize the conclusions of the last section. Suppose we are given a sequence $\{\phi_n\}$ in $C_0$ with the property that $\sum_n \int |\phi_n|\,dt < \infty$. Then we know two things:

(1) The series $\sum_n \phi_n$ converges almost everywhere, and
(2) The series $\sum_n \int \phi_n\,dt$ converges.

If $f$ is a function that satisfies

$$f = \sum_n \phi_n \quad \text{(a.e.)},$$

then it seems natural to define the integral of $f$ by

$$\int f\,dt = \sum_n \int \phi_n\,dt,$$

which is what we do.


D.7.1 DEFINITION. Let $f: R \to R$ be a function that satisfies $f = \sum_n \phi_n$ (a.e.) for some sequence $\{\phi_n\}$ in $C_0$ with $\sum_n \int |\phi_n|\,dt < \infty$. Then $f$ is said to be Lebesgue integrable and the (Lebesgue) integral of $f$ is given by

$$\int f\,dt = \sum_n \int \phi_n\,dt.$$

We shall let $L(-\infty,\infty)$ or $L$ denote the class of Lebesgue integrable functions. This definition is well formulated since the value of $\int f\,dt$ does not depend on the approximating sequence $\{\phi_n\}$ (see Corollary D.6.5).

It follows that every function in $C_0$ is Lebesgue integrable, that is, $C_0 \subset L$ and the Lebesgue integral agrees with the Riemann integral. Indeed, if $f \in C_0$, let $\phi_1 = f$, $\phi_n = 0$, $n = 2, 3, \ldots$. Then $f = \sum_n \phi_n$ and $\sum_n \int |\phi_n|\,dt = \int |\phi_1|\,dt < \infty$. So $f$ is Lebesgue integrable. Finally

$$\int f\,dt = \sum_n \int \phi_n\,dt = \int \phi_1\,dt,$$

that is, the two integrals agree.
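To see Definition D.7.1 at work on a function outside $C_0$, here is a minimal sketch under assumptions of our own. Take $\phi_n$ to be a continuous "tent" of height 1 whose base is the interval $[n, n + 2^{-n}]$. Each $\phi_n$ lies in $C_0$, the sum $f = \sum_n \phi_n$ does not have compact support, and $\sum_n \int |\phi_n|\,dt = \sum_n 2^{-(n+1)} = 1/2 < \infty$. Hence $f \in L$ and $\int f\,dt = 1/2$. The helper names below are hypothetical.

```python
from fractions import Fraction

def tent_integral(n):
    """Exact integral of a tent of height 1 over a base of width 2**-n (area = width / 2)."""
    return Fraction(1, 2 ** n) / 2

def partial_integral(N):
    """Sum of the integrals of phi_1, ..., phi_N."""
    return sum(tent_integral(n) for n in range(1, N + 1))

for N in (1, 5, 10, 20):
    print(N, partial_integral(N), float(partial_integral(N)))
```

The partial sums increase toward 1/2, which is the value the definition assigns to $\int f\,dt$.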
It is convenient to have a characterization of L in terms of sequences. 


D.7.2 THEOREM. Let $f: R \to R$ be given. Then $f \in L$ if and only if there is a sequence $\{\psi_n\}$ in $C_0$ that satisfies

$$\lim_{n\to\infty} \psi_n = f \ \text{(a.e.)}, \qquad \lim_{n,m\to\infty} \int |\psi_n - \psi_m|\,dt = 0. \tag{D.7.1}$$

In this case one has

$$\int f\,dt = \lim_{n\to\infty} \int \psi_n\,dt. \tag{D.7.2}$$

Proof: Let $f \in L$ and let $\{\phi_n\}$ be a sequence in $C_0$ that satisfies the conditions of Definition D.7.1. It is now easy to check that the sequence

$$\psi_n = \sum_{k=1}^{n} \phi_k$$

satisfies (D.7.1).

Conversely, if $\{\psi_n\}$ is a sequence that satisfies (D.7.1), then we can find a subsequence $\{\psi_{n_k}\}$ such that

$$\int |\psi_{n_{k+1}} - \psi_{n_k}|\,dt \le 2^{-k}.$$

If we set $\phi_1 = \psi_{n_1}$ and $\phi_k = \psi_{n_k} - \psi_{n_{k-1}}$ for $k = 2, 3, \ldots$, then it is easy to check that $\{\phi_k\}$ satisfies the conditions of Definition D.7.1. Hence $f \in L$.

We shall leave the proof of (D.7.2) as an exercise. ∎


In the next theorem we shall list some of the elementary properties of the 
Lebesgue integral. While these properties are important they do not form our 
main objective. The most important properties of the Lebesgue integral will be 
discussed in the next section where we present the basic limit theorems. 


D.7.3 THEOREM. Let $f, g \in L$ and let $\alpha$ and $\beta$ be real numbers. Then the following statements are valid:

(a) $\alpha f + \beta g$ is in $L$ and $\int (\alpha f + \beta g)\,dt = \alpha \int f\,dt + \beta \int g\,dt$.

(b) $|f|$ is in $L$ and $|\int f\,dt| \le \int |f|\,dt$.

(c) If $f \ge 0$, then $\int f\,dt \ge 0$.

(d) $f \wedge g$ and $f \vee g$ are in $L$.



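Proof: By using Definition D.7.1 and Theorem D.7.2 one can find sequences $\{\phi_n\}$, $\{\psi_n\}$, and $\{\xi_n\}$ in $C_0$ such that $f = \sum_n \phi_n$ (a.e.), $g = \sum_n \psi_n$ (a.e.), $f = \lim \xi_n$ (a.e.), and where

$$\sum_n \int |\phi_n|\,dt < \infty, \qquad \sum_n \int |\psi_n|\,dt < \infty, \qquad \lim_{n,m\to\infty} \int |\xi_n - \xi_m|\,dt = 0.$$

(a) Let $E_1$ and $E_2$ be null sets with the property that for $t \notin E_1$ one has $f(t) = \sum_n \phi_n(t)$, and for $t \notin E_2$ one has $g(t) = \sum_n \psi_n(t)$. Now define

$$E_3 = \Big\{t : \sum_n |\phi_n(t)| = +\infty\Big\}, \qquad E_4 = \Big\{t : \sum_n |\psi_n(t)| = +\infty\Big\}.$$

It follows that $E_3$ and $E_4$ are null sets and $E = \bigcup_{i=1}^{4} E_i$ is a null set. It is easy to see that for $t \notin E$ one has

$$\alpha f(t) + \beta g(t) = \sum_n [\alpha \phi_n(t) + \beta \psi_n(t)].$$

Furthermore,

$$\sum_n \int |\alpha \phi_n + \beta \psi_n|\,dt \le |\alpha| \sum_n \int |\phi_n|\,dt + |\beta| \sum_n \int |\psi_n|\,dt < \infty.$$

Hence $\alpha f + \beta g$ is in $L$, and

$$\int (\alpha f + \beta g)\,dt = \sum_n \int (\alpha \phi_n + \beta \psi_n)\,dt = \alpha \sum_n \int \phi_n\,dt + \beta \sum_n \int \psi_n\,dt = \alpha \int f\,dt + \beta \int g\,dt.$$

(b) Since $|f| = \lim |\xi_n|$ (a.e.) and $\big|\,|\xi_n| - |\xi_m|\,\big| \le |\xi_n - \xi_m|$ we see that

$$\lim_{n,m\to\infty} \int \big|\,|\xi_n| - |\xi_m|\,\big|\,dt = 0.$$

Hence $|f|$ is in $L$ by Theorem D.7.2. Furthermore,

$$\Big| \int \xi_n\,dt \Big| \le \int |\xi_n|\,dt,$$

so

$$\Big| \int f\,dt \Big| = \lim_{n\to\infty} \Big| \int \xi_n\,dt \Big| \le \lim_{n\to\infty} \int |\xi_n|\,dt = \int |f|\,dt.$$

(c) If $f \ge 0$, then $f = \lim |\xi_n|$ (a.e.) and by the last paragraph one has

$$\int f\,dt = \lim_{n\to\infty} \int |\xi_n|\,dt \ge 0.$$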




(d) This follows from (a) and (b) once one observes that

$$f \wedge g = \tfrac{1}{2}(f + g) - \tfrac{1}{2}|f - g|,$$
$$f \vee g = \tfrac{1}{2}(f + g) + \tfrac{1}{2}|f - g|. \ ∎$$
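The two identities used in part (d) can be checked pointwise. The sketch below is only a numerical sanity check on a few sample values of our own choosing, and the names `meet` and `join` are hypothetical.

```python
def meet(x, y):
    """Pointwise minimum written as in part (d): (x + y)/2 - |x - y|/2."""
    return 0.5 * (x + y) - 0.5 * abs(x - y)

def join(x, y):
    """Pointwise maximum written as in part (d): (x + y)/2 + |x - y|/2."""
    return 0.5 * (x + y) + 0.5 * abs(x - y)

# Dyadic sample values, so the floating-point arithmetic is exact.
pairs = [(3.0, -1.5), (-2.0, -2.0), (0.25, 4.0)]
for x, y in pairs:
    print((x, y), meet(x, y), join(x, y))
```

Because both expressions are built from sums, scalar multiples, and absolute values, parts (a) and (b) apply to them directly.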
EXAMPLE 1. We have shown that $C_0$ lies in $L$. Let us now show that certain piecewise continuous functions lie in $L$. To begin let $f: R \to R$ be a bounded function with the property that

$f: (a,b) \to R$ is continuous, and
$f(t) = 0$ for $t \notin [a,b]$,

where $a < b$. Let us show that $f \in L$ (see Figure D.7.1). Since $f$ is bounded, there is an $M$ such that $|f(t)| \le M$ for all $t$. Choose monotone sequences $\{a_n\}$ and $\{b_n\}$ in $[a,b]$ so that $a_n < b_n$, $a = \lim a_n$, and $b = \lim b_n$, for example, $a_n = a + 1/n$, $b_n = b - 1/n$. Define $\phi_n$ in $C_0$ so that $|\phi_n(t)| \le M$ for $a \le t \le b$ and

$$\phi_n(t) = \begin{cases} f(t), & a_n \le t \le b_n, \\ 0, & t \notin [a,b]. \end{cases}$$

Figure D.7.1.

Then $f = \lim \phi_n$ (a.e.). Furthermore for $n \ge m$ one has

$$\int |\phi_n - \phi_m|\,dt \le 2M(|a_m - a| + |b - b_m|) \to 0$$

as $n, m \to \infty$. It follows from Theorem D.7.2 that $f \in L$. Furthermore, it is easy to see that $\int f\,dt = \int_a^b f\,dt$, where $\int_a^b f\,dt$ is the Riemann integral of $f$.

Since a finite linear combination of functions in $L$ is also in $L$, we see that any bounded, piecewise continuous function with compact support is in $L$. In particular any step function with compact support is in $L$. A step function is defined to be a piecewise constant function.¹ ∎
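The estimate in Example 1 can be tried numerically. The sketch below assumes the special case $f = 1$ on $(0,1)$ (so $M = 1$) and takes $\phi_n$ to be a continuous trapezoid equal to 1 on $[1/n, 1 - 1/n]$ and falling linearly to 0 at the endpoints. This is one admissible choice, not the only one. A midpoint rule stands in for the integral, and all names are hypothetical.

```python
def phi(n, t, a=0.0, b=1.0):
    """Continuous trapezoid: 0 outside (a, b), equal to 1 on [a + 1/n, b - 1/n]."""
    if t <= a or t >= b:
        return 0.0
    return min(1.0, n * (t - a), n * (b - t))

def l1_distance(n, m, a=0.0, b=1.0, steps=100_000):
    """Midpoint-rule approximation of the integral of |phi_n - phi_m| over [a, b]."""
    h = (b - a) / steps
    total = 0.0
    for i in range(steps):
        t = a + (i + 0.5) * h
        total += abs(phi(n, t, a, b) - phi(m, t, a, b))
    return total * h

print(l1_distance(8, 4))    # close to 1/4 - 1/8
print(l1_distance(64, 32))  # close to 1/32 - 1/64
```

For these trapezoids the distance equals $1/m - 1/n$ when $n \ge m$, which sits below the bound $2M(|a_m - a| + |b - b_m|) = 4/m$ and tends to 0.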


We will need another characterization of L in the next section. 


¹ We warn the reader that some authors use the term “step function” in a somewhat different manner.




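If $f \in L$, then there is a sequence $\{\phi_n\}$ in $C_0$ with $f = \sum_{n=1}^{\infty} \phi_n$ (a.e.) and $\sum_{n=1}^{\infty} \int |\phi_n|\,dt < \infty$. Now let $m \ge 1$ and define $\psi_1 = \sum_{n=1}^{m+1} \phi_n$ and $\psi_n = \phi_{m+n}$ for $n = 2, 3, \ldots$. Then $f = \sum_{n=1}^{\infty} \psi_n$ (a.e.) and $\sum_{n=1}^{\infty} \int |\psi_n|\,dt < \infty$. Since $\sum_{n=2}^{\infty} \int |\psi_n|\,dt = \sum_{n=m+2}^{\infty} \int |\phi_n|\,dt \to 0$ as $m \to \infty$, we see that the series $\sum_{n=2}^{\infty} \int |\psi_n|\,dt$ can be made arbitrarily small by choosing $m$ large. We thus have the following result.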


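D.7.4 THEOREM. A function $f$ is in $L$ if and only if for every $\varepsilon > 0$ there is a sequence $\{\psi_n\}$ in $C_0$ that satisfies

$$f = \sum_{n=1}^{\infty} \psi_n \ \text{(a.e.)} \qquad \text{and} \qquad \sum_{n=2}^{\infty} \int |\psi_n|\,dt \le \varepsilon. \tag{D.7.3}$$

We leave it as an exercise for the reader to show that if $f \in L$ and $\{\psi_n\}$ satisfies (D.7.3), then

$$\int |f - \psi_1|\,dt \le \varepsilon, \tag{D.7.4}$$

and

$$\sum_{n=1}^{\infty} \int |\psi_n|\,dt \le \int |f|\,dt + 2\varepsilon. \tag{D.7.5}$$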


EXERCISES 


1. Prove Equation (D.7.2). 
2. Show that (D.7.3) implies (D.7.4) and (D.7.5). 
3. Let $f \in L$ with $\int |f|\,dt = 0$. Show that $f = 0$ (a.e.).


8. LIMIT THEOREMS 


In this section we come to the heart of the Lebesgue theory. The limit theorems 
we shall now discuss have an importance which is hard to overemphasize. We ask 
that the reader take particular note of them. 


We are primarily interested in three types of limit theorems. Let $\{f_n\}$ be a sequence of functions in $L$ and let $f = \lim f_n$. We will consider the following cases:

I. MONOTONE CONVERGENCE $f_n \le f_{n+1}$.

II. NONMONOTONE CONVERGENCE (FATOU’S THEOREM).

III. DOMINATED CONVERGENCE $|f_n(t)| \le g(t)$.

One of course expects a stronger conclusion with a stronger hypothesis, and this will indeed be the case. Each of the theorems on sequences will have a counterpart for infinite series.

We begin this section with a discussion of the Levi Theorem for infinite series, $f = \sum_n f_n$. This theorem is a capsule version of the last three sections. The important thing to note is that the functions $f_n$ now belong to $L$ and not merely to $C_0$.


D.8.1 THEOREM. (LEVI.) Let $\{f_n\}$ be a sequence in $L$ with $\sum_{n=1}^{\infty} \int |f_n|\,dt < \infty$. Then the series $\sum_n f_n$ converges almost everywhere. Moreover, if $f = \sum_n f_n$ (a.e.), then $f \in L$ and

$$\int f\,dt = \sum_n \int f_n\,dt.$$

Proof: It follows from Theorem D.7.4 and (D.7.5) that for each $n$, there is a sequence $\{\psi_{nk}\}$ in $C_0$ with $f_n = \sum_{k=1}^{\infty} \psi_{nk}$ (a.e.) and

$$\sum_{k=1}^{\infty} \int |\psi_{nk}|\,dt \le \int |f_n|\,dt + 2^{-n}.$$

Let $E_n$ be the null set for which $f_n(t) = \sum_k \psi_{nk}(t)$ for $t \notin E_n$. Since

$$\sum_n \sum_k \int |\psi_{nk}|\,dt \le \sum_n \int |f_n|\,dt + \sum_n 2^{-n} < \infty,$$

it follows from the definition of a null set that there is a null set $E$ such that the series $\sum_n \sum_k \psi_{nk}(t)$ converges (absolutely) for $t \notin E$. Hence, if $t$ is not in the null set $E \cup (\bigcup_n E_n)$, the series

$$\sum_n f_n(t) = \sum_n \sum_k \psi_{nk}(t)$$

converges, that is, this series converges almost everywhere.

If $f = \sum_n f_n$ (a.e.), then $f = \sum_n \sum_k \psi_{nk}$ (a.e.) and thus $f \in L$. Furthermore,

$$\int f\,dt = \sum_n \Big( \sum_k \int \psi_{nk}\,dt \Big) = \sum_n \int f_n\,dt. \ ∎$$


Next we turn to the Monotone Convergence Theorem, which is a simple corollary of the Levi Theorem. But before we do that, let us recall that a sequence of functions $\{f_n\}$ is said to be increasing if $f_n \le f_{n+1}$ for all $n$. If one has $f_n \ge f_{n+1}$, then the sequence is said to be decreasing.


D.8.2 THEOREM. (MONOTONE CONVERGENCE.) Let $\{f_n\}$ be an increasing sequence in $L$ with $\int f_n\,dt \le M < \infty$ for all $n$. Let $f = \lim f_n$, then $f \in L$ and

$$\int f\,dt = \lim_{n\to\infty} \int f_n\,dt.$$

Proof: Let $g_n = f_n - f_{n-1}$, where $f_0 = 0$. Then $g_n \ge 0$ and

$$\sum_n \int |g_n|\,dt = \sum_n \int g_n\,dt = \lim_{n\to\infty} \int f_n\,dt < \infty,$$

and $f = \sum_n g_n$ (a.e.). Hence by Levi’s Theorem $f \in L$ and

$$\int f\,dt = \sum_n \int g_n\,dt = \lim_{n\to\infty} \int f_n\,dt. \ ∎$$

One can obviously get the same conclusion if one assumes instead that $\{f_n\}$ is a decreasing sequence and $\int f_n\,dt$ is bounded below.
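A concrete instance may help here. The sketch below assumes the truncations $f_n(t) = \min(n, t^{-1/2})$ for $0 < t \le 1$ and $f_n(t) = 0$ elsewhere, an example of our own choosing. Each $f_n$ is bounded and piecewise continuous with compact support, the sequence is increasing, and $\int f_n\,dt = 2 - 1/n \le 2$. The theorem then places the unbounded limit $f(t) = t^{-1/2}$ on $(0,1]$ in $L$ with $\int f\,dt = 2$. The function name is hypothetical.

```python
from fractions import Fraction

def integral_truncated(n):
    """Exact integral of min(n, t**-0.5) over (0, 1] for a positive integer n."""
    cut = Fraction(1, n * n)              # t**-0.5 >= n exactly when t <= 1/n**2
    flat_part = n * cut                   # constant n on (0, cut]
    tail_part = 2 * (1 - Fraction(1, n))  # integral of t**-0.5 on [cut, 1]
    return flat_part + tail_part

for n in (1, 2, 4, 10, 100):
    print(n, integral_truncated(n))
```

The printed values increase and stay below the bound $M = 2$, which is also their limit.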
The next result is concerned with unrestricted convergence of a sequence $\{f_n\}$:


D.8.3 THEOREM. (FATOU.) Let $\{f_n\}$ be a sequence of nonnegative functions in $L$ with

$$\liminf_{n\to\infty} \int f_n\,dt = \lim_{n\to\infty} \inf_{k \ge n} \int f_k\,dt < \infty$$

and assume that $f = \lim f_n$ (a.e.). Then $f \in L$ and

$$\int f\,dt \le \liminf_{n\to\infty} \int f_n\,dt.$$

Proof: Let $n$ be fixed and for each integer $k \ge 1$ let

$$h_{n,k} = f_n \wedge f_{n+1} \wedge \cdots \wedge f_{n+k}.$$

It follows from Theorem D.7.3 (d) that $h_{n,k} \in L$ and $0 \le h_{n,k} \le f_n$. Furthermore $\{h_{n,k}\}$ is decreasing in $k$. If we let

$$g_n = \lim_{k\to\infty} h_{n,k},$$

it follows from the Monotone Convergence Theorem that $g_n \in L$ and

$$\int g_n\,dt = \lim_{k\to\infty} \int h_{n,k}\,dt \le \int f_n\,dt.$$

It is clear that $g_n$ is increasing in $n$. Since $f = \lim g_n$ (a.e.) one has $f \in L$ and

$$\int f\,dt = \lim_{n\to\infty} \int g_n\,dt \le \liminf_{n\to\infty} \int f_n\,dt. \ ∎$$
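The inequality in Fatou's Theorem can be strict. As an illustrative sketch (the sequence is our own choice, not taken from the text), let $f_n = n$ on the open interval $(0, 1/n)$ and $f_n = 0$ elsewhere. Then $\int f_n\,dt = 1$ for every $n$, while $f_n(t) \to 0$ for every $t$, so $\int f\,dt = 0 < 1 = \liminf \int f_n\,dt$.

```python
from fractions import Fraction

def f(n, t):
    """Spike of height n on the open interval (0, 1/n), zero elsewhere."""
    return n if 0 < t < Fraction(1, n) else 0

def integral_f(n):
    """Exact integral of f_n: height n times width 1/n."""
    return n * Fraction(1, n)

t = Fraction(1, 100)
print([f(n, t) for n in (10, 50, 100, 1000)])  # eventually 0 at this fixed t
print([integral_f(n) for n in (10, 50, 100, 1000)])
```

No integrable function dominates every $f_n$ here, which is why the limit of the integrals differs from the integral of the limit.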


n-> 00 noo 


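The Fatou Theorem can be reformulated for infinite series. Let $g_N = \sum_{n=1}^{N} f_n$, and assume that $g_N \ge 0$ for all $N$. Furthermore, assume that

$$\liminf_{N\to\infty} \sum_{n=1}^{N} \int f_n\,dt < \infty,$$

and that $g = \sum_{n=1}^{\infty} f_n$ (a.e.). Then $g \in L$ and

$$\int g\,dt = \int \Big( \sum_{n=1}^{\infty} f_n \Big)\,dt \le \liminf_{N\to\infty} \sum_{n=1}^{N} \int f_n\,dt.$$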


We now turn to the most important of the limit theorems. 


D.8.4 THEOREM. (DOMINATED CONVERGENCE.) Let $\{f_n\}$ be a sequence in $L$ and assume that $f = \lim f_n$ (a.e.). If there is a $g \in L$ such that $|f_n| \le g$ (a.e.), then $f \in L$ and

$$\int f\,dt = \lim_{n\to\infty} \int f_n\,dt. \tag{D.8.1}$$

Proof: Since $0 \le g + f_n \le 2g$ we know that

$$\liminf_{n\to\infty} \int (g + f_n)\,dt \le \int 2g\,dt < \infty.$$

Hence by Fatou’s Theorem $g + f = \lim (g + f_n)$ (a.e.) is in $L$. Thus $f = (g + f) - g$ is in $L$. Furthermore,

$$\int (g + f)\,dt \le \liminf_{n\to\infty} \int (g + f_n)\,dt,$$

which implies that

$$\int f\,dt \le \liminf_{n\to\infty} \int f_n\,dt. \tag{D.8.2}$$

Similarly we have $0 \le g - f_n \le 2g$, so

$$\int (g - f)\,dt \le \liminf_{n\to\infty} \int (g - f_n)\,dt,$$

which implies that

$$-\int f\,dt \le \liminf_{n\to\infty} \int (-f_n)\,dt = -\limsup_{n\to\infty} \int f_n\,dt,$$

or

$$\limsup_{n\to\infty} \int f_n\,dt \le \int f\,dt. \tag{D.8.3}$$

By combining (D.8.2) and (D.8.3) we see that $\lim_{n\to\infty} \int f_n\,dt$ exists and (D.8.1) is valid. ∎
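For a case where the hypotheses of the theorem do hold, the following sketch assumes $f_n(t) = nt/(1 + n^2 t^2)$ on $[0,1]$, an example chosen for illustration. Since $2nt \le 1 + n^2t^2$, every $f_n$ is bounded by the integrable function $g = 1/2$ on $[0,1]$, and $f_n(t) \to 0$ for each $t$. The closed form $\int_0^1 f_n\,dt = \ln(1 + n^2)/(2n)$ tends to 0, as (D.8.1) predicts. The midpoint rule is only a numerical stand-in for the integral.

```python
import math

def f(n, t):
    """Illustrative dominated sequence f_n(t) = n t / (1 + (n t)^2) on [0, 1]."""
    return n * t / (1.0 + (n * t) ** 2)

def integral_closed_form(n):
    """Integral of f_n over [0, 1], obtained from the antiderivative ln(1 + n^2 t^2) / (2n)."""
    return math.log(1.0 + n * n) / (2.0 * n)

def integral_midpoint(n, steps=100_000):
    """Midpoint-rule approximation of the same integral."""
    h = 1.0 / steps
    return h * sum(f(n, (i + 0.5) * h) for i in range(steps))

for n in (1, 10, 50, 1000):
    print(n, integral_closed_form(n))
```

The printed integrals decrease toward 0, the integral of the limit function.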


D.8.5 COROLLARY. Let $f = \lim f_n$ (a.e.), where $\{f_n\}$ is a sequence in $L$. If $|f| \le g$ (a.e.), where $g \in L$, then $f \in L$.

Proof: Define $g_n$ by

$$g_n(t) = \begin{cases} g(t), & \text{if } g(t) \le f_n(t), \\ f_n(t), & \text{if } -g(t) \le f_n(t) \le g(t), \\ -g(t), & \text{if } f_n(t) \le -g(t), \end{cases}$$

or equivalently

$$g_n = g \wedge [f_n \vee (-g)].$$

Hence $g_n \in L$, $|g_n| \le g$ (a.e.) and $f = \lim g_n$ (a.e.). So by the Dominated Convergence Theorem we see that $f \in L$. ∎


D.8.6 COROLLARY. Let $f = \lim f_n$ (a.e.), where $\{f_n\}$ is a sequence in $L$. Then $f \in L$ if and only if $|f| \in L$.

Proof: This follows from the last corollary (let $g = |f|$) and Theorem D.7.3 (b). ∎


The Dominated Convergence Theorem can also be reformulated for infinite 
series. 


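D.8.7 COROLLARY. Let $\{f_n\}$ be a sequence in $L$ and assume that $f = \sum_n f_n$ (a.e.). If there is a $g \in L$ such that

$$\Big| \sum_{n=1}^{N} f_n \Big| \le |g| \quad \text{(a.e.)}$$

for all $N$, then $f \in L$ and

$$\int f\,dt = \int \Big( \sum_n f_n \Big)\,dt = \sum_n \int f_n\,dt.$$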


EXERCISES 


1. Extend Fatou’s Theorem to the following case. Let $\{f_n\}$ be a sequence of nonnegative functions, where

$$\liminf_{n\to\infty} \int f_n\,dt < \infty.$$

Show that $\liminf f_n(t)$ exists almost everywhere and if $f = \liminf f_n(t)$, then

$$\int f\,dt \le \liminf_{n\to\infty} \int f_n\,dt.$$


2. Consider the differential equation $X' = f(X,t)$ where $f: R \times R \to R$ is a continuous function. Assume that $|f(X,t)| \le m(t)$ for all $X$ in $R$ and $t$ in some interval $I = [a,b]$ where $m(t)$ is Lebesgue integrable. Let $\{X_n(t)\}$ be a sequence of solutions of this equation on $I$ and assume that $X_n(t) \to X_0(t)$ for all $t$ in $I$. Show that $X_0$ is a solution. [Hint: Convert the differential equation into an integral equation $X(t) = X(a) + \int_a^t f(X(s), s)\,ds$. Now apply the appropriate limit theorems.]


3. A typical control theory problem can be formulated as follows: Consider the differential equation $X' = f(X,t,u)$, where $f$ is continuous and $u$ is a control parameter. One seeks a control $u(t)$ such that the solution $X(t)$ satisfies $X(0) = X_0$, and $X(1) = X_1$, where $X_0$ and $X_1$ are fixed. (If $u$ does this we say that it is admissible.) Furthermore, we require $u$ to be chosen so that the integral $\int_0^1 |u(t)|\,dt$ assumes its minimum value. Assume that one has found a sequence of controllers $\{u_n\}$ such that

$$\int_0^1 |u_n(t)|\,dt \to \inf \Big\{ \int_0^1 |u(t)|\,dt : u \text{ is admissible} \Big\},$$

$$u_0 = \lim u_n \quad \text{(a.e. on } [0,1]\text{)},$$

where $u_0$ is admissible. Show that

$$\int_0^1 |u_0(t)|\,dt = \inf \Big\{ \int_0^1 |u(t)|\,dt : u \text{ is admissible} \Big\},$$

that is, $u_0$ is an “optimal” control.


4. First let $f_n \ge 0$ satisfy $\int |f_n|^p\,dt < \infty$, for some $p$ with $1 \le p < \infty$ and then let $\|f_n\|_p = (\int |f_n|^p\,dt)^{1/p}$. Assume that $\sum_n \|f_n\|_p < \infty$.

(a) Show that the series $\sum_n f_n$ converges almost everywhere.

(b) If $f = \sum_n f_n$ (a.e.) show that $\|f\|_p \le \sum_n \|f_n\|_p$.

[Hint: Let $g_n = \sum_{i=1}^{n} f_i$ and use the Minkowski Inequality to show that $\|g_n\|_p \le \sum_{i=1}^{n} \|f_i\|_p$. Now apply the Monotone Convergence Theorem.]


5. Let $f$ be a Riemann-integrable function on the interval $I = [a,b]$. Assume that $f(t) = 0$ for $t \notin I$. Show that $f$ is in $L$ and that the Riemann integral of $f$ and the Lebesgue integral of $f$ agree. [Hint: Use Theorem D.2.3 and the Monotone Convergence Theorem.]


6. Prove Corollary D.8.7. 


9. MISCELLANY 


In this section we collect a number of refinements or extensions of the notion 
of the integral presented above. 


Complex-Valued Functions 


Let $f = u + iv$, where $u$ and $v$ are the real and imaginary parts of $f$. If both $u$ and $v$ belong to $L$, we say that $f \in L$ and define

$$\int f\,dt = \int u\,dt + i \int v\,dt.$$

We invite the reader to prove the Dominated Convergence Theorem for complex-valued functions. Another interesting (and useful) exercise is to show that for a complex-valued function $f$ one has $f \in L$ if and only if $|f| \in L$.


Vector-Valued Functions 


If $f: R \to R^n$, then one has $f = (f_1, f_2, \ldots, f_n)$, where $f_i: R \to R$, $1 \le i \le n$. Now define $\int f\,dt$ to be the vector

$$\int f\,dt = \Big( \int f_1\,dt, \int f_2\,dt, \ldots, \int f_n\,dt \Big)$$

provided $f_i \in L$, $1 \le i \le n$. We ask the reader to show that

$$\Big\| \int f\,dt \Big\| \le \int \|f\|\,dt,$$

where $\|\cdot\|$ is any norm on $R^n$.


Characteristic Functions and Measure 
If $A$ is a set in $R$, then the characteristic function of $A$, $\chi_A$, is given by

$$\chi_A(t) = \begin{cases} 1, & \text{if } t \in A, \\ 0, & \text{if } t \notin A. \end{cases}$$

If $\chi_A \in L$, then we shall say that $A$ is measurable and define the Lebesgue measure of $A$ by

$$m(A) = \int \chi_A\,dt.$$

More generally, if $\chi_A$ is a characteristic function and $A = \bigcup_{n=1}^{\infty} A_n$, where $\{A_n\}$ is a sequence of mutually disjoint sets with $\chi_{A_n} \in L$, then we shall say that $A$ is measurable and define the Lebesgue measure of $A$ by

$$m(A) = \sum_{n=1}^{\infty} \int \chi_{A_n}\,dt,$$

which may be $+\infty$. One may ask whether $m(A)$ depends on the sequence $\{A_n\}$. The answer, as seen in Corollary D.6.5, is negative when $\sum_{n=1}^{\infty} m(A_n)$ is finite. Moreover, even when this sum is $+\infty$, the answer is the same.

What are some of the properties of the collection of measurable sets? We 
shall list them here and ask the reader to verify our assertions. 


1. If $A$ is measurable, then the complement $A'$ is measurable.

2. If $\{A_1, A_2, \ldots\}$ is a sequence of measurable sets, then $\bigcup_{n=1}^{\infty} A_n$ and $\bigcap_{n=1}^{\infty} A_n$ are measurable.

3. Every interval is measurable.

4. Every open set and every closed set is measurable.

5. Let $A$ be a measurable set with measure zero, that is, $m(A) = 0$. Then $A$ is a null set.

6. Let $\{A_1, A_2, \ldots\}$ be a sequence of mutually disjoint measurable sets, then

$$m\Big( \bigcup_{n=1}^{\infty} A_n \Big) = \sum_{n=1}^{\infty} m(A_n).$$

7. Let $A$ and $B$ be measurable sets with $A \subset B$. Then $m(A) \le m(B)$.
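For finite unions of bounded intervals these properties reduce to bookkeeping with lengths. The following sketch (the helper `measure_of_union` is hypothetical) merges overlapping intervals and adds the lengths of the resulting disjoint pieces, which is property 6 applied to a finite family. Endpoints are ignored because single points have measure zero.

```python
def measure_of_union(intervals):
    """Lebesgue measure of a finite union of bounded intervals given as (left, right) pairs."""
    total = 0
    current_left = current_right = None
    for left, right in sorted(intervals):
        if current_right is None or left > current_right:
            # The previous merged piece is finished; add its length.
            if current_right is not None:
                total += current_right - current_left
            current_left, current_right = left, right
        else:
            # Overlapping or touching interval: extend the current piece.
            current_right = max(current_right, right)
    if current_right is not None:
        total += current_right - current_left
    return total

print(measure_of_union([(0, 1), (2, 3)]))  # disjoint pieces: lengths add
print(measure_of_union([(0, 2), (1, 3)]))  # overlap is counted once
```

Adding an interval to the list can never decrease the result, in line with property 7.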


Measurable Functions 


A function $f: R \to R$ is said to be measurable if $f^{-1}(O)$ is a measurable set in $R$ whenever $O$ is an open set in $R$. (Compare this with the characterization of a continuous function in Theorem 3.9.7.) It is easy to see that $f$ is a measurable function if and only if

$$\{t : f(t) < \alpha\} \tag{D.9.1}$$

is a measurable set for every $\alpha$ in $R$. One can also replace the inequality $<$ in (D.9.1) by either $\le$, or $>$, or $\ge$ without changing the collection of measurable functions.

Many of the properties of measurable functions are easy consequences of the 
above properties. For example, 


1. Every continuous function is measurable. 


2. If $f$ and $g$ are measurable, then $f \wedge g$, $f \vee g$, $fg$, and $f + g$ are measurable.

3. If $f = \lim f_n$ (a.e.), where the functions $f_n$ are measurable, then $f$ is measurable. (Hence the functions in $L$ are measurable.)

4. If $\{f_n\}$ is a sequence of measurable functions, then $\sup\{f_1, f_2, \ldots\}$ and $\inf\{f_1, f_2, \ldots\}$ are measurable.

5. If $f$ is a measurable function with $|f| \le g$ (a.e.), where $g \in L$, then $f \in L$.


The Notation $\int_A f\,dt$

Let $f \in L$ and let $A$ be a measurable set. One then defines $\int_A f\,dt$ by

$$\int_A f\,dt = \int f \cdot \chi_A\,dt.$$

If $A$ is the interval $[a,b]$, then one oftentimes writes

$$\int_A f\,dt = \int_a^b f\,dt.$$

If $A = R$, then $\int_A f\,dt$ becomes $\int f\,dt$ which is sometimes written as

$$\int f\,dt = \int_{-\infty}^{\infty} f\,dt.$$

Figure D.9.1.


The Cantor Set 


Let us now consider an example of a particularly interesting subset of $E = [0,1]$. We will now define a family $\{A_n\}$ of disjoint sets where each $A_n$ is the union of a finite number of intervals. Each $A_n$ will be open and $A = \bigcup_{n=1}^{\infty} A_n$ will be an open set in $E$. The complement $A_n' = E - A_n$ will, of course, be closed and each $A_n'$ will be nonempty. Furthermore, the sequence $C_N = \bigcap_{n=1}^{N} A_n'$ will be chosen to be a decreasing sequence of nonempty closed sets, and therefore the intersection

$$C = \bigcap_{n=1}^{\infty} A_n' = \bigcap_{N=1}^{\infty} C_N$$

is nonempty, closed, and compact. (Refer to Exercise 32, Section 3.17.) The set $C$ will be the Cantor set. Now for the specifics.

Let $A_1$ be an open interval of length $a$, $0 < a < 1$, with center at $\tfrac{1}{2}$, that is, $A_1 = (1/2 - a/2,\ 1/2 + a/2)$. (We will be particularly interested in the case $a = \tfrac{1}{3}$, then $A_1 = (\tfrac{1}{3}, \tfrac{2}{3})$ is the (open) middle third of the interval $E = [0,1]$.) The complement $A_1'$ is then the disjoint union of two closed intervals, $I_1^1$ and $I_2^1$ of equal length, $m(I_1^1) = m(I_2^1) = (1 - a)/2$. Now let $r$ be a fixed real number satisfying $0 < r \le 1 - a$. See Fig. D.9.1.

$A_2$ is defined to be the union of two open intervals $J_1^1$ and $J_2^1$, each of length $ar/2$, where $J_i^1$ and $I_i^1$ ($i = 1, 2$) have the same center. (In the special case $a = \tfrac{1}{3}$ and $r = \tfrac{2}{3}$, the intervals $J_1^1$ and $J_2^1$ form the middle thirds of the intervals $I_1^1$ and $I_2^1$, that is, $A_2 = (\tfrac{1}{9}, \tfrac{2}{9}) \cup (\tfrac{7}{9}, \tfrac{8}{9})$.) The complement of $A_1 \cup A_2$ now consists of four disjoint closed intervals $I_1^2$, $I_2^2$, $I_3^2$, and $I_4^2$ of equal length, $(1 - a - ar)/4$.

Now assume that $A_1, A_2, \ldots, A_n$ have been chosen so that they are disjoint and open, and that each $A_k$, $k = 1, 2, \ldots, n$, consists of $2^{k-1}$ open intervals of equal length with $m(A_k) = a \cdot r^{k-1}$. Also the set $E - (A_1 \cup \cdots \cup A_n)$ consists of $2^n$ disjoint closed intervals $I_1^n, \ldots, I_{2^n}^n$ of equal length. It is easy to see that

$$m(I_j^n) = 2^{-n}(1 - a - ar - \cdots - ar^{n-1}), \qquad (j = 1, \ldots, 2^n).$$

We now define $A_{n+1}$ as follows: $A_{n+1}$ will be the union of $2^n$ open intervals $J_1^n, \ldots, J_{2^n}^n$, each of length $2^{-n} a r^n$, where $J_j^n$ and $I_j^n$ have the same centers. Then $m(A_{n+1}) = ar^n$ and the set $E - (A_1 \cup \cdots \cup A_{n+1})$ consists of $2^{n+1}$ disjoint closed intervals of equal length.

The open sets $A_n$ have now been defined for all $n$, by mathematical induction. They are disjoint and $m(A_n) = ar^{n-1}$. Therefore, if $A = \bigcup_{n=1}^{\infty} A_n$, then $A$ is open and measurable and by property (6) for the measure $m$ one has

$$m(A) = \sum_{n=1}^{\infty} a r^{n-1} = \frac{a}{1 - r}.$$

The complement of $A$, $C = E - A$, is nonempty, closed, and measurable and

$$m(C) = m(E) - m(A) = \frac{1 - r - a}{1 - r}.$$

$C$ is defined to be the Cantor set. The following theorem gives the pertinent properties of the Cantor set.
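The inductive construction translates directly into a short program. The sketch below (function names are hypothetical) uses exact rational arithmetic: at step $k + 1$ it removes from each remaining closed interval a centered open interval of length $2^{-k} a r^k$, and it evaluates the formula $m(C) = 1 - a/(1 - r)$. It assumes $0 < a < 1$ and $0 < r \le 1 - a$, as in the text.

```python
from fractions import Fraction

def cantor_stages(a, r, stages):
    """Return (removed, kept): removed[k] lists the open intervals of A_{k+1},
    kept lists the closed intervals left after `stages` steps."""
    kept = [(Fraction(0), Fraction(1))]
    removed = []
    for k in range(stages):
        gap = a * r ** k / 2 ** k  # length of each interval removed at this step
        new_kept, step = [], []
        for left, right in kept:
            mid = (left + right) / 2
            step.append((mid - gap / 2, mid + gap / 2))
            new_kept.append((left, mid - gap / 2))
            new_kept.append((mid + gap / 2, right))
        removed.append(step)
        kept = new_kept
    return removed, kept

def cantor_measure(a, r):
    """m(C) = m(E) - m(A) = 1 - a / (1 - r)."""
    return 1 - a / (1 - r)

removed, kept = cantor_stages(Fraction(1, 3), Fraction(2, 3), 2)
print(removed[1])  # the two intervals making up A_2 in the middle-third case
print(cantor_measure(Fraction(1, 3), Fraction(2, 3)), cantor_measure(Fraction(1, 4), Fraction(1, 2)))
```

With $a = 1/3$ and $r = 2/3$ the measure is 0, while $a = 1/4$ and $r = 1/2$ gives a Cantor set of measure $1/2$.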


D.9.1 THEOREM. Let $C$ be a Cantor set in $[0,1]$. Then $C$ is nonempty, closed, compact, nowhere dense and $0 \le m(C) < 1$. Moreover, $C$ contains uncountably many points.

Proof: We have already shown that $C$ is nonempty, closed, compact, and $0 \le m(C) < 1$. Let us now show that $C$ is nowhere dense. This means that $\operatorname{Int} C$, the interior of $C$, is empty. Let $x \in C$. We want to show that there does not exist a nontrivial open interval $B_\varepsilon(x) = \{y : |x - y| < \varepsilon\}$ which lies entirely in $C$. In other words, every nontrivial open interval containing $x$ also contains points in the complement $C'$.

Let $A_n$ be defined as above and let $C_n = E - (A_1 \cup \cdots \cup A_n)$. Then $C_n$ consists of $2^n$ disjoint closed intervals, each of length

$$2^{-n}(1 - a - ar - \cdots - ar^{n-1}) < 2^{-n}.$$

Therefore, if $y \in C_n$ and $\delta > 2^{-n}$, then it is also true that $B_\delta(y)$ does not lie entirely in $C_n$, since $m(B_\delta(y)) = 2\delta > 2^{-n}$.

Now

$$C = \bigcap_{n=1}^{\infty} C_n,$$

so if $x \in C$, then $x \in C_n$ for every $n = 1, 2, \ldots$. Now let $\varepsilon > 0$ be given and choose $n$ so that $\varepsilon > 2^{-n}$. Then $B_\varepsilon(x)$ does not lie entirely in $C_n$, that is, there is a $y$ in $B_\varepsilon(x)$ such that $y \notin C_n$. Hence, $y \notin C$, so we have shown that $C$ is nowhere dense.

We will leave the proof of the last statement, that $C$ is uncountable, as an exercise. ∎


In the special case where $a = \tfrac{1}{3}$ and $r = \tfrac{2}{3}$, we see that $m(C) = 0$, that is, the Cantor set has measure zero. This is true more generally when $a$ is arbitrary ($0 < a < 1$) and $r = 1 - a$. However, if $0 < r < 1 - a$, then $m(C) > 0$. That is, $C$ is then a set which has positive measure, but it does not contain a nontrivial interval!


EXERCISES 


1. By using the definition of the Riemann integral above show directly that $\chi_C$ is Riemann integrable if and only if $m(C) = 0$.

2. Let $f: [a,b] \to R$ be given and define

$$\operatorname{Discont}(f) = \{t \in [a,b] : f \text{ is not continuous at } t\}.$$

It can be shown (see Riesz and Sz. Nagy [1, pp. 23-24]) that the function $f$ is Riemann integrable if and only if $\operatorname{Discont}(f)$ is a set of measure zero. Let $\chi_C$ be the characteristic function of the Cantor set. Show that $\chi_C$ is Riemann integrable if and only if $m(C) = 0$.


3. Extend the concept of a measurable function to complex-valued functions. Do 
the same for vector-valued functions. Which, if any, of the properties of real- 
valued measurable functions are valid for vector-valued measurable functions? 




10. OTHER DEFINITIONS OF THE INTEGRAL 


The construction or the definition of the (Lebesgue) integral we have presented is based on the Daniell approach. That is, we began with a space of functions $C_0$ satisfying properties (I.1), (I.2), (I.3), and (I.4) in Section 4. This approach can be generalized. What is really needed is that at the outset we be given a set $\mathcal{E}$ of functions, call them elementary functions, and an integral defined on $\mathcal{E}$ so that the four properties of Section 4 are satisfied. We then extend the integral so that it is defined on a superset of $\mathcal{E}$. In this section we shall point out two such generalizations and invite the reader to carry out the details of the construction of the integral.


EXAMPLE 1. (STEP FUNCTIONS.) Let $\mathcal{E}$ denote the collection of all real-valued step functions defined on $R$ and with compact support. A function $\phi \in \mathcal{E}$ can be described as follows: There are two sequences of real numbers $\{t_0, t_1, \ldots, t_n\}$ and $\{\alpha_1, \alpha_2, \ldots, \alpha_n\}$, which depend on $\phi$, such that $t_0 < t_1 < \cdots < t_n$ and²

$$\phi(t) = \alpha_i, \quad t_{i-1} < t < t_i, \quad i = 1, 2, \ldots, n,$$
$$\phi(t) = 0, \quad t < t_0 \text{ or } t_n < t. \tag{D.10.1}$$

Define the integral of $\phi$ by

$$\int \phi\,dt = \sum_{i=1}^{n} \alpha_i (t_i - t_{i-1}), \tag{D.10.2}$$

where $\phi$ satisfies (D.10.1). It is easy to see that this class of functions $\mathcal{E}$, together with the integral defined by (D.10.2), satisfies properties (I.1), (I.2), and (I.3) in Section 4. Property (I.4) is somewhat harder to prove, and this argument is outlined in the exercises.

From this point one can use the Daniell approach to define null sets, convergence almost everywhere, and finally the Lebesgue integral. In order to show that this definition of the Lebesgue integral agrees with that given in Section D.7 one need only prove that, in terms of the new integral, every function in $C_0$ is integrable and the integral agrees with the Riemann integral. We invite the reader to check these assertions. ∎
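Formula (D.10.2) is simple enough to state as code. In the sketch below a step function is represented by its breakpoints $t_0 < t_1 < \cdots < t_n$ and its values $\alpha_1, \ldots, \alpha_n$. This representation and the name `step_integral` are assumptions made for the illustration.

```python
def step_integral(breakpoints, values):
    """Integral (D.10.2): sum of alpha_i * (t_i - t_{i-1}) over the n pieces."""
    if len(breakpoints) != len(values) + 1:
        raise ValueError("need exactly one more breakpoint than values")
    if any(t0 >= t1 for t0, t1 in zip(breakpoints, breakpoints[1:])):
        raise ValueError("breakpoints must be strictly increasing")
    return sum(alpha * (t1 - t0)
               for alpha, t0, t1 in zip(values, breakpoints, breakpoints[1:]))

# phi = 2 on (0, 1) and -1 on (1, 3): the integral is 2*1 + (-1)*2
print(step_integral([0, 1, 3], [2, -1]))
```

The values of the function at the breakpoints themselves never enter the computation.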


EXAMPLE 2. (MEASURE SPACES.) Let $X$ be a nonempty set and let $\mathcal{B}$ denote a $\sigma$-field of subsets of $X$. This means that the empty set $\emptyset$ is in $\mathcal{B}$ and whenever $\{E_1, E_2, \ldots\}$ belong to $\mathcal{B}$, then the complement $E_1' = X - E_1$ and $\bigcup_{n=1}^{\infty} E_n$ belong to $\mathcal{B}$. This also implies that $\bigcap_{n=1}^{\infty} E_n = (\bigcup_{n=1}^{\infty} E_n')'$ belongs to $\mathcal{B}$. Let $\mu$ be a positive³ measure defined on $\mathcal{B}$. This means that $\mu$ maps $\mathcal{B}$ into the reals $R$ in such a way that $\mu(\emptyset) = 0$, $\mu(E) \ge 0$ for all $E$ in $\mathcal{B}$, and

$$\mu\Big( \bigcup_{n=1}^{\infty} E_n \Big) = \sum_{n=1}^{\infty} \mu(E_n)$$

whenever $\{E_1, E_2, \ldots\}$ is a sequence of mutually disjoint sets in $\mathcal{B}$. Some specific examples of measure spaces are discussed in the exercises.

² It is not necessary for our purposes to specify the values of $\phi$ at the points $\{t_0, t_1, \ldots, t_n\}$. (Why?)

³ The adjective “positive” refers to the condition $\mu(E) \ge 0$. If we drop this condition we have an arbitrary measure. (See Section D.14.)

We now wish to define an integral on $X$. For this purpose we shall let $\mathcal{B}_0$ denote those sets $E$ in $\mathcal{B}$ that satisfy $\mu(E) < \infty$. A function $\phi: X \to R$ is said to be a simple function if there are a finite number of disjoint sets $\{E_0, E_1, \ldots, E_n\}$ in $\mathcal{B}$, and real numbers $\{\alpha_0, \alpha_1, \ldots, \alpha_n\}$ such that $\alpha_0 = 0$, $\{E_1, \ldots, E_n\}$ belong to $\mathcal{B}_0$, $\bigcup_{i=0}^{n} E_i = X$, and $\phi(x) = \alpha_i$, if $x \in E_i$, $0 \le i \le n$. We allow the possibility that any of the sets $E_i$, including $E_0$, may be empty. We now define the integral of $\phi$ by

$$\int \phi\,\mu(dx) = \sum_{i=1}^{n} \alpha_i \mu(E_i). \tag{D.10.3}$$

It is easy to see that the class $\mathcal{S}$ of all simple functions together with the integral defined by (D.10.3) satisfies properties (I.1), (I.2), and (I.3) in Section 4. We ask the reader to check this. Let us prove here the generalization of Dini’s Theorem, property (I.4).

In order to do this we will need the following property of the measure $\mu$. “If $\{A_n\}$ is a decreasing sequence from $\mathcal{B}_0$ and $\bigcap_n A_n = \emptyset$, then $\mu(A_n) \to 0$ as $n \to \infty$.”

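Let $\{\phi_n\}$ be a decreasing sequence of nonnegative simple functions with $\lim \phi_n(x) = 0$ for all $x$ in $X$. Let $E$ denote the set in $X$ for which $\phi_1(x) > 0$. Then $E \in \mathcal{B}_0$. Let $M < \infty$ satisfy $\phi_1(x) \le M$ for all $x$ in $X$. Since $0 \le \phi_n \le \phi_1$ one has $\phi_n(x) = 0$ when $x \notin E$, and $\phi_n(x) \le M$ for all $x$ in $X$.

Let $\varepsilon > 0$ be given and define $A_n$ by

$$A_n = \{x \in E : \phi_n(x) > \varepsilon\}.$$

It follows that $\{A_n\}$ is a decreasing sequence in $\mathcal{B}_0$ with $\bigcap_{n=1}^{\infty} A_n = \emptyset$. Now let $\psi_n$ be the simple function

$$\psi_n(x) = \begin{cases} M, & x \in A_n, \\ \varepsilon, & x \in E - A_n, \\ 0, & x \notin E. \end{cases}$$

Then $0 \le \phi_n \le \psi_n$, and

$$\int \phi_n\,\mu(dx) \le \int \psi_n\,\mu(dx) = \varepsilon \mu(E - A_n) + M \mu(A_n) \le \varepsilon \mu(E) + M \mu(A_n).$$

Now choose $N$ so that $\mu(A_n) \le \varepsilon$ whenever $n \ge N$. One then has

$$\int \phi_n\,\mu(dx) \le (\mu(E) + M)\varepsilon, \qquad n \ge N.$$

Thus $\lim_{n\to\infty} \int \phi_n\,\mu(dx) = 0$, which proves (I.4).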

Starting from this point we can again use the Daniell approach to define null sets, convergence almost everywhere and the integral $\int f\,\mu(dx)$. A variation does occur when one defines a measurable set and a measurable function. Specifically a measurable set is simply a set in the $\sigma$-field $\mathcal{B}$. Then a real-valued function $f: X \to R$ is said to be measurable if for every open set $O$ in $R$, $f^{-1}(O)$ is a measurable set in $X$. (Also see Exercise 8.)

An important version of this example occurs when one is discussing probability spaces. In that case $\mu$ is a probability measure and $\mu(X) = 1$. We shall treat this case further in Appendix E. ∎
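The integral (D.10.3) of a simple function can be sketched on a finite measure space. Everything below is an assumption made for illustration: $X = \{0, 1, 2, 3\}$ with all subsets measurable, a simple function stored as pairs $(\alpha_i, E_i)$ with disjoint sets $E_i$ (the piece with $\alpha_0 = 0$ is simply omitted), and two measures, the counting measure and the point mass at 0.

```python
def simple_integral(pieces, mu):
    """Integral (D.10.3): sum of alpha_i * mu(E_i) over pieces (alpha_i, E_i) with disjoint E_i."""
    return sum(alpha * mu(E) for alpha, E in pieces)

def counting_measure(E):
    """mu(E) = number of points of E."""
    return len(E)

def dirac_at_zero(E):
    """mu(E) = 1 if 0 lies in E, and 0 otherwise."""
    return 1 if 0 in E else 0

# phi = 2 on {0, 1}, 5 on {2}, and 0 on the remaining point 3
pieces = [(2, frozenset({0, 1})), (5, frozenset({2}))]
print(simple_integral(pieces, counting_measure))  # 2*2 + 5*1
print(simple_integral(pieces, dirac_at_zero))     # 2*1 + 5*0
```

The same function has different integrals under different measures, and under the point mass the integral is just the value of the function at 0.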


EXERCISES 


1. Show that Properties (I.1), (I.2), and (I.3) hold for the integral in (D.10.2) defined on the step functions $\mathcal{E}$.

2. Show that Property (I.4) holds for the integral on the step functions $\mathcal{E}$.

3. The Borel field $\mathcal{B}$ in $R$ is defined to be the smallest $\sigma$-field of sets from $R$ that contains all open intervals $(a,b)$. The sets in the Borel field are called Borel sets.

(a) Show that every interval is a Borel set.

(b) Show that every open set is a Borel set.

(c) Show that every closed set is a Borel set.

(d) If $I = (a,b)$, let $m(I) = b - a$. Show that there is one and only one measure $\mu$ defined on $\mathcal{B}$ that satisfies $\mu(I) = m(I)$ for every open interval $I$. (This measure is called the Lebesgue measure.)

(e) Show that the definition of the integral in Section 7 agrees with the definition in Example 2 when the measure $\mu$ is the Lebesgue measure.

4. Let $f \in L$ and $f \ge 0$. Let $\mathcal{B}$ denote the Borel sets in $R$. For $E \in \mathcal{B}$ let

$$\mu(E) = \int f \cdot \chi_E\,dt.$$

(a) Show that $\mu$ is a positive measure on $R$.
(b) Show that if $g$ is a simple function, then $\int g\,\mu(dx) = \int f \cdot g\,dt$.
(c) What can you say about $\int g\,\mu(dx)$ when $g$ is not a simple function?

5. Let $\mathcal{B}$ be the Borel field in $R$ and define $\mu$ on $\mathcal{B}$ by

$$\mu(E) = \begin{cases} 1, & \text{if } 0 \in E, \\ 0, & \text{if } 0 \notin E. \end{cases}$$

(a) Show that $\mu$ is a measure on $R$.
(b) Compute $\int g\,\mu(dx)$ when $g$ is a continuous function.

6. The Borel field on $R^n$ is the smallest $\sigma$-field of sets from $R^n$ that contains all open rectangles $\{(x_1, \ldots, x_n) : a_i < x_i < b_i,\ i = 1, 2, \ldots, n\}$. Define the Lebesgue measure on $R^n$ by defining

$$m(\{(x_1, \ldots, x_n) : a_i < x_i < b_i,\ i = 1, 2, \ldots, n\}) = \prod_{i=1}^{n} (b_i - a_i)$$

for each rectangle and then extending $m$ to $\mathcal{B}$. Construct the Lebesgue integral on $R^n$.

7. Let $L$ be the space of Lebesgue-integrable functions on $R$ and let $\int$ denote the Lebesgue integral.

(a) Show that $L$ satisfies properties (I.1), (I.2), (I.3), and (I.4).

Use the Daniell approach starting with $L$ to answer the following questions:

(b) Describe the null sets in terms of Section D.5.
(c) Describe convergence almost everywhere in terms of Section D.6.
(d) Describe the extended integral in terms of the Lebesgue integral. [Hint: Use Levi’s Theorem.]

8. Use the definition of measurable function in Example 2 and the definition of Borel set in Exercise 3 to show that a function $f: X \to R$ is measurable if and only if $f^{-1}(B)$ is measurable in $X$ for every Borel set $B \subset R$.


11. THE LEBESGUE SPACES $L_p$


The Lebesgue spaces $L_p$ play a central role in the study of functional analysis. These spaces are employed extensively in the text. Here we shall limit our discussion to the definition and the proof of a basic theorem, which says that the Lebesgue spaces are complete.


D.11.1 DEFINITION. Let $E$ be a measurable set in $R$ and let $p$ satisfy $1 \le p < \infty$.
A function $f: E \to R$ is said to belong to $L_p(E)$ if

$$\int_E |f|^p \, dt < \infty,$$

or equivalently, if $|f \cdot \chi_E|^p$ belongs to $L$. Moreover, a norm is defined on $L_p(E)$ by

$$\|f\|_p = \left( \int_E |f|^p \, dt \right)^{1/p}.$$
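The norm in Definition D.11.1 can be approximated numerically for well-behaved functions. The following is a minimal sketch, assuming a continuous integrand on a bounded interval and using a midpoint rule in place of the Lebesgue integral; the names `lp_norm` and `exact_norm_of_identity` are illustrative and do not come from the text.

```python
def lp_norm(f, a, b, p, steps=20000):
    """Approximate (integral_a^b |f(t)|^p dt)^(1/p) with the midpoint rule.

    This is only a stand-in for the Lebesgue integral and is reasonable
    for continuous f on the bounded interval [a, b].
    """
    h = (b - a) / steps
    total = sum(abs(f(a + (i + 0.5) * h)) ** p for i in range(steps))
    return (total * h) ** (1.0 / p)


def exact_norm_of_identity(p):
    """For f(t) = t on E = [0, 1], the integral of t^p is 1/(p + 1)."""
    return (1.0 / (p + 1)) ** (1.0 / p)


if __name__ == "__main__":
    for p in (1, 2, 4):
        print(p, lp_norm(lambda t: t, 0.0, 1.0, p), exact_norm_of_identity(p))
```

For $f(t) = t$ on $E = [0,1]$ the two printed columns should agree to several decimal places, which gives a quick sanity check on the formula for $\|f\|_p$.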


D.11.2 THEOREM. Let $\{f_n\}$ be a sequence in $L_p(E)$, $1 \le p < \infty$, with the
property that

$$\lim_{m,n \to \infty} \int_E |f_m - f_n|^p \, dt = 0.$$

Then there is a subsequence $\{f_{n_k}\}$ and an $f \in L_p(E)$ that satisfies $f = \lim f_{n_k}$ (a.e.)
and

$$\lim_{n \to \infty} \int_E |f - f_n|^p \, dt = 0. \tag{D.11.1}$$


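Proof: It will suffice to assume that $E = R$. If not, we could replace $f_n$ by $\tilde f_n$,
where

$$\tilde f_n(t) = \begin{cases} f_n(t), & t \in E, \\ 0, & t \notin E. \end{cases}$$

Then $\{\tilde f_n\}$ satisfies

$$\int |\tilde f_m - \tilde f_n|^p \, dt = \int_E |f_m - f_n|^p \, dt.$$

Take the case $p = 1$ first. If

$$\lim_{m,n \to \infty} \int |f_m - f_n| \, dt = 0, \tag{D.11.2}$$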



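then we can find a subsequence $\{f_{n_k}\}$ that satisfies

$$\int |f_{n_{k+1}} - f_{n_k}| \, dt \le 2^{-k}.$$

Now set $g_1 = f_{n_1}$, $g_k = f_{n_k} - f_{n_{k-1}}$, $k = 2,3,\ldots$. Then the sequence $\{g_k\}$ satisfies
$\sum_{k=1}^{\infty} \int |g_k| \, dt < \infty$. So by the Levi Theorem D.8.1 there is an $f$ in $L$ such that

$$f = \sum_{k=1}^{\infty} g_k = \lim f_{n_k} \quad \text{(a.e.)}.$$

It follows from (D.11.2) that for every $\varepsilon > 0$ there is an $M$ such that for $n_k, n \ge M$
one has

$$\int |f_{n_k} - f_n| \, dt \le \varepsilon.$$

By now letting $n_k \to \infty$ we get

$$\int |f - f_n| \, dt \le \varepsilon,$$

hence (D.11.1) is true for the case $p = 1$.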
The proof for the general case $1 < p < \infty$ is similar. First choose a subsequence
$\{f_{n_k}\}$ so that

$$\left( \int |f_{n_{k+1}} - f_{n_k}|^p \, dt \right)^{1/p} \le 2^{-k}$$

and define $\{g_k\}$ as above. It follows from Exercise 4, Section D.8 that the series
$\sum_{k=1}^{\infty} |g_k|$ converges almost everywhere, hence the series $\sum_{k=1}^{\infty} g_k$ converges almost
everywhere. If we let

$$f = \sum_{k=1}^{\infty} g_k = \lim f_{n_k} \quad \text{(a.e.)},$$

then we can repeat the above argument to show that $f \in L_p$ and that (D.11.1)
holds. ∎


There is also a Lebesgue space for $p = \infty$, which we now define.


D.11.3 DEFINITION. Let $E$ be a measurable set in $R$. Then $L_\infty(E)$ will denote
the collection of all bounded measurable functions on $E$. The essential supremum of
a function $f \in L_\infty(E)$ is given by

$$\operatorname*{ess\,sup}_E |f| = \inf\{B : |f| \le B \ \text{(a.e.)}\}.$$

If $f$ is continuous, it is easy to see that $\operatorname{ess\,sup} |f|$ agrees with $\sup |f|$.
There is a completeness theorem for $L_\infty$. We leave the proof as an exercise.


D.11.4 THEOREM. Let $\{f_n\}$ be a sequence in $L_\infty(E)$ with the property that

$$\lim_{m,n \to \infty} \left( \operatorname*{ess\,sup}_E |f_m - f_n| \right) = 0.$$




Then there is a subsequence $\{f_{n_k}\}$ and an $f$ in $L_\infty(E)$ that satisfies $f = \lim f_{n_k}$ (a.e.)
and

$$\lim_{n \to \infty} \left( \operatorname*{ess\,sup}_E |f - f_n| \right) = 0.$$


The concepts and results of this section can be extended to complex-valued 
functions and even vector-valued functions. Furthermore, these results are valid 
for any integral that is defined using the Daniell approach. In particular, these 
results extend to the integral defined on general measure spaces in Example 2, 
Section D.10. We shall discuss these extensions in the exercises. 


EXERCISES 
1. Prove Theorem D.11.4. [Hint: Show that the set

$$\{t : \{f_n(t)\} \text{ is not a Cauchy sequence}\}$$

is a null set.]

2. Prove Theorems D.11.2 and D.11.4 for complex-valued functions.
3. Prove Theorems D.11.2 and D.11.4 for vector-valued functions.
4. Determine $f$ where $f = \lim f_n$ (a.e.) and

$$f_n(t) = \begin{cases} n/2, & |t| \le 1/n, \\ 0, & |t| > 1/n. \end{cases}$$

Does $\int |f - f_n|^p \, dt \to 0$ as $n \to \infty$ for any $p$, $1 \le p < \infty$? Does $\operatorname{ess\,sup} |f - f_n| \to 0$
as $n \to \infty$?

5. Converse of Hölder Inequality: Assume that $\left| \int_E f(t)g(t) \, dt \right| \le K \left( \int_E |g(t)|^q \, dt \right)^{1/q}$
for every $g \in L_q(E)$. Show that $f \in L_p(E)$, where $p^{-1} + q^{-1} = 1$, and that
$\left( \int_E |f|^p \, dt \right)^{1/p} \le K$.


12. DENSE SUBSPACES OF $L_p$, $1 \le p < \infty$


In the study of certain linear operators, particularly unbounded operators,
one is interested in knowing whether the domain, or the range, of an operator is a
dense subspace of some normed linear space. In this section we wish to prove an
omnibus theorem which asserts that the space $C_0^\infty = C_0^\infty(R,R)$, the space of
infinitely differentiable functions with compact support, is dense in $L_p$, $1 \le p < \infty$. It
is immediate then that if any subspace $\mathcal{D}$ of $L_p$ contains $C_0^\infty$, then $\mathcal{D}$ is dense in $L_p$.
Consider $L_p = L_p(-\infty,\infty)$ with the usual norm

$$\|f\|_p = \left( \int |f|^p \, dt \right)^{1/p}.$$

It is a consequence of Theorem D.7.4 and Inequality (D.7.4) that the space $C_0$ is
dense in $L_1 = L$. It is easy to show that $C_0$ is dense in $L_p$ for all $p$, $1 \le p < \infty$, as
we shall now see. Let $p$ be fixed with $1 \le p < \infty$ and let $f \in L_p$. Define $f_n$ by

$$f_n(t) = \begin{cases} n, & \text{if } f(t) > n, \\ f(t), & \text{if } -n \le f(t) \le n, \\ -n, & \text{if } f(t) < -n. \end{cases}$$




For $n = 1,2,\ldots$ one has $|f_n|^p \le |f|^p$, hence $f_n \in L_p$. Furthermore, $\lim f_n = f$ (a.e.)
and

$$|f - f_n|^p \le (|f| + |f_n|)^p \le 2^p |f|^p.$$

By the Dominated Convergence Theorem, one has

$$\|f - f_n\|_p = \left( \int |f - f_n|^p \, dt \right)^{1/p} \to 0, \quad \text{as } n \to \infty.$$

Conclusion: The bounded functions are dense in $L_p$, $1 \le p < \infty$.
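The truncation argument above can be checked on a concrete unbounded function. The sketch below is an illustration under stated assumptions, not part of the text: it takes $f(t) = t^{-1/4}$ on $(0,1]$ (extended by zero elsewhere), which lies in $L_2$ but is unbounded, and computes $\|f - f_n\|_2$ for the truncations $f_n = \min(f, n)$. A short calculation gives the exact value $1/(n\sqrt{3})$, so the error tends to zero as the argument predicts. The function name `truncation_error_l2` is hypothetical.

```python
import math


def truncation_error_l2(n, steps=20000):
    """||f - f_n||_2 for f(t) = t**(-1/4) on (0, 1] and f_n = min(f, n).

    f exceeds n only for t < n**-4. The substitution t = u**4 turns
        integral_0^{n^-4} (t^{-1/4} - n)^2 dt
    into
        integral_0^{1/n} 4*u*(1 - n*u)^2 du,
    which is a smooth integrand suited to the midpoint rule.
    """
    upper = 1.0 / n
    h = upper / steps
    total = 0.0
    for i in range(steps):
        u = (i + 0.5) * h
        total += 4.0 * u * (1.0 - n * u) ** 2
    return math.sqrt(total * h)


if __name__ == "__main__":
    for n in (1, 2, 5, 10, 100):
        print(n, truncation_error_l2(n), 1.0 / (n * math.sqrt(3.0)))
```

The substitution is used only to avoid integrating a singular integrand numerically; it does not change the quantity being computed.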


Let $f$ be a bounded function in $L_p$ and let $f_n = f\chi[-n,n]$, where $\chi[-n,n]$ is the
characteristic function of the interval $[-n,n]$. One then has $|f_n|^p \le |f|^p$, hence
$f_n \in L_p$ and $f = \lim f_n$ (a.e.). By repeating the reasoning of the last paragraph we can
show that $\|f - f_n\|_p \to 0$ as $n \to \infty$.

Conclusion: The bounded functions with compact support are dense in $L_p$,
$1 \le p < \infty$.

If $f$ is a bounded function with compact support and $f \in L_p$, then we claim that
$f \in L_1$. Indeed, if $|f| \le M$ (a.e.), where $M$ is a positive constant, and if $f$ vanishes
outside the interval $[a,b]$, then $|f| \le M\chi[a,b]$ (a.e.), and by Corollary D.8.5 we
see that $f \in L_1$.


D.12.1 THEOREM. The space $C_0$ is dense in $L_p$, $1 \le p < \infty$.


Proof: Let $f \in L_p$ and let $\varepsilon > 0$ be given. Choose a bounded function $g$ in
$L_p$, where $g$ has compact support and $\|f - g\|_p \le \varepsilon$. Say that $|g| \le M\chi_I$, where
$I = [a,b]$ is a bounded interval. Since $g \in L_1$, it follows that we can find a sequence
$\{\phi_n\}$ in $C_0$ such that $g = \lim \phi_n$ (a.e.). Without any loss of generality we can assume
that $|\phi_n| \le M\chi_I$. Then

$$|g - \phi_n|^p \le (|g| + |\phi_n|)^p \le (2M)^p \chi_I.$$

It follows from the Dominated Convergence Theorem that

$$\|g - \phi_n\|_p \to 0 \quad \text{as } n \to \infty.$$

Now choose $n$ so that $\|g - \phi_n\|_p \le \varepsilon$. It then follows from the Minkowski Inequality
that $\|f - \phi_n\|_p \le 2\varepsilon$. ∎


Let us now prove the main theorem of this section. Let $C_0^\infty$ denote the space
of real-valued infinitely differentiable functions with compact support.


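D.12.2 THEOREM. The space $C_0^\infty$ is dense in $L_p$, $1 \le p < \infty$.

Proof: Let $f \in L_p$ and let $\varepsilon > 0$ be given. Choose $\phi \in C_0$ so that $\|f - \phi\|_p \le \varepsilon$.
For any $n = 1, 2, \ldots$, let $\psi_n \in C_0^\infty$ satisfy $0 \le \psi_n$, $\psi_n(t) = 0$ for $|t| \ge 1/n$ and
$\int \psi_n \, dt = 1$. For example, we might take

$$\psi_n(t) = \begin{cases} 0, & \text{if } |t| \ge 1/n, \\ C \exp\!\left( \dfrac{1}{n^2 t^2 - 1} \right), & \text{if } |t| < 1/n, \end{cases}$$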



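where $C$ is an appropriate constant. Now set

$$\phi_n(t) = \int_{-\infty}^{\infty} \phi(s)\psi_n(t - s) \, ds = \int_{-\infty}^{\infty} \phi(t - s)\psi_n(s) \, ds = \int_{-1/n}^{1/n} \phi(t - s)\psi_n(s) \, ds.$$

It follows from the first equality that $\phi_n$ is infinitely differentiable. Since $\phi$ has compact
support we see that $\phi_n \in C_0^\infty$.
If $|\phi| \le M$, then

$$|\phi_n(t)| \le \int_{-1/n}^{1/n} |\phi(t - s)| \psi_n(s) \, ds \le M \int_{-1/n}^{1/n} \psi_n(s) \, ds = M.$$

Next we claim that $\phi = \lim \phi_n$. To prove this we shall compute

$$\phi(t) - \phi_n(t) = \phi(t) - \int_{-1/n}^{1/n} \phi(t - s)\psi_n(s) \, ds = \int_{-1/n}^{1/n} [\phi(t) - \phi(t - s)]\psi_n(s) \, ds.$$

Since $\phi$ is continuous at $t$, for every $\varepsilon > 0$ we can find an $N$ such that

$$|\phi(t) - \phi(t - s)| \le \varepsilon$$

whenever $|s| \le 1/N$. It follows, then, that for $n \ge N$ one has $|\phi(t) - \phi_n(t)| \le \varepsilon$.

It then follows from the Dominated Convergence Theorem that $\|\phi - \phi_n\|_p \to 0$
as $n \to \infty$. If we choose $n$ so that $\|\phi - \phi_n\|_p \le \varepsilon$, then the Minkowski Inequality
assures us that for this $n$ we have $\|f - \phi_n\|_p \le 2\varepsilon$. ∎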
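The smoothing step in the proof can be tried numerically. The following is a rough sketch under explicit assumptions: the bump is taken to be proportional to $\exp(1/(n^2 s^2 - 1))$ on $|s| < 1/n$ (one common choice of mollifier), the constant $C$ is replaced by a discrete normalisation so that the quadrature weights sum to 1, and $\phi$ is a tent function on $[-1,1]$. The names `bump_weights`, `tent`, and `mollify` are illustrative only.

```python
import math


def bump_weights(n, steps=400):
    """Midpoint nodes s_j in (-1/n, 1/n) with weights proportional to the
    bump exp(1/(n^2 s^2 - 1)), normalised so that the weights sum to 1.
    The normalisation plays the role of the constant C."""
    h = (2.0 / n) / steps
    nodes = [-1.0 / n + (j + 0.5) * h for j in range(steps)]
    raw = [math.exp(1.0 / (n * n * s * s - 1.0)) for s in nodes]
    total = sum(raw)
    return nodes, [r / total for r in raw]


def tent(t):
    """A continuous function with compact support [-1, 1] and |tent| <= 1."""
    return max(0.0, 1.0 - abs(t))


def mollify(phi, t, n):
    """Discrete version of phi_n(t) = integral phi(t - s) psi_n(s) ds."""
    nodes, weights = bump_weights(n)
    return sum(w * phi(t - s) for s, w in zip(nodes, weights))


if __name__ == "__main__":
    for n in (2, 5, 20, 100):
        print(n, abs(tent(0.0) - mollify(tent, 0.0, n)))
```

Because the tent function is Lipschitz with constant 1 and the weights form a discrete probability distribution on $(-1/n, 1/n)$, the printed error at each point is at most $1/n$, in line with the estimate $|\phi(t) - \phi_n(t)| \le \varepsilon$ for $n \ge N$.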


EXERCISES 


1. The space $L_\infty$ was omitted for good reason. Show that $C_0$ is not dense in $L_\infty$.


2. Let $C_0^\infty[a,b]$ denote the space of all infinitely differentiable functions $f$ with the
property that $f$ and all its derivatives vanish at $a$ and $b$. Show that $C_0^\infty[a,b]$ is
dense in $L_p[a,b]$ for $1 \le p < \infty$.

3. Let $C_0^\infty(R^n)$ denote the space of all infinitely differentiable real-valued functions
$u(x_1,\ldots,x_n)$ with compact support. Show that $C_0^\infty(R^n)$ is dense in $L_p(R^n)$,
$1 \le p < \infty$.

4. Let $\Omega$ denote an open set in $R^n$ and let $C_0^\infty(\Omega)$ denote the space of all infinitely
differentiable real-valued functions $u(x_1,\ldots,x_n)$ with compact support inside
$\Omega$. Show that $C_0^\infty(\Omega)$ is dense in $L_p(\Omega)$, $1 \le p < \infty$.


13. DIFFERENTIATION 


In this section we shall study the relationship between F and f when F and f 
satisfy 


$$F(t) = F(a) + \int_a^t f(s) \, ds, \quad a \le t \le b, \tag{D.13.1}$$




and $f \in L_1[a,b]$. If $f$ is continuous, then the Fundamental Theorem of Calculus tells
us that

$$\frac{dF}{dt} = f \quad \text{on } a \le t \le b. \tag{D.13.2}$$

Conversely, if $F$ is any $C^1$-function that satisfies (D.13.2), then $F$ is given by (D.13.1).
Our interest here is in determining what happens when $f$ is not continuous.

Our discussion will be a bit sketchy and we will leave a number of nontrivial 
details for the reader to check. The goal is to give a complete characterization of 
functions $F$ that satisfy (D.13.1) for some $f$ in $L_1[a,b]$. We shall see that $F$ satisfies
(D.13.1) if and only if F is absolutely continuous. Furthermore, we shall see that if 
F is absolutely continuous, then dF/dt exists almost everywhere, and if dF/dt = f 
(a.e.), then F and f satisfy (D.13.1). 


D.13.1 DEFINITION. Let $I$ be a compact interval. A function $F$ is said to be
absolutely continuous on $I$ if for every $\varepsilon > 0$ there is a $\delta > 0$ such that whenever
$I_i = [a_i,b_i]$, $i = 1,\ldots,n$, are nonoverlapping intervals in $I$ with $\sum_{i=1}^{n} |b_i - a_i| \le \delta$, one has
$\sum_{i=1}^{n} |F(b_i) - F(a_i)| \le \varepsilon$. If $I$ is an arbitrary interval, we say that $F$ is absolutely
continuous on $I$ if $F$ is absolutely continuous on every compact subinterval.


We can now make two observations. First, every absolutely continuous func- 
tion is continuous, and second, every continuously differentiable function is abso- 
lutely continuous. That absolute continuity implies continuity follows directly from 
the definitions. The second statement is a special case of Theorem D.13.3 given 
below. 

We will need the following lemma. 


D.13.2 LEMMA. Let $F$ satisfy a Lipschitz condition on $I$ (that is, there is a
constant $K > 0$ such that $|F(t) - F(s)| \le K|t - s|$ for all $t$ and $s$ in $I$). Then $F$ is
absolutely continuous.


Proof: Since

$$\sum_i |F(b_i) - F(a_i)| \le K \sum_i |b_i - a_i|,$$

we see that if

$$\sum_i |b_i - a_i| \le \varepsilon K^{-1} = \delta,$$

then

$$\sum_i |F(b_i) - F(a_i)| \le \varepsilon. \quad ∎$$


This lemma can be generalized to the case where $F$ satisfies a local Lipschitz
condition on $I$, which means that for every compact set $J \subset I$, there is a $K > 0$ such
that

$$|F(t) - F(s)| \le K|t - s|$$




for all $t$ and $s$ in $J$. (In this case, $K$ gets larger as the interval $J$ gets larger.) By the
last lemma, we see that $F$ is absolutely continuous on every compact interval $J \subset I$,
so $F$ is absolutely continuous on $I$.

The following theorem completely characterizes the class of absolutely con- 
tinuous functions. It also gives the counterpart of the Fundamental Theorem of 
Calculus. We will not prove the entire theorem here since this would require 
techniques which are beyond the scope of this book. For a complete discussion of 
this theorem we refer the reader to Riesz and Sz. Nagy [1, pp. 50-54] and Royden
[1, pp. 80-92].


D.13.3 THEOREM. A function $F$ defined on $I$ is absolutely continuous if and only
if $F$ satisfies (D.13.1) for some $f \in L_1[a,b]$. Moreover, under this condition, $F'$ exists
a.e. and $F' = f$ (a.e.).


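Proof: We will prove here that if $F$ can be expressed by (D.13.1) then $F$ is
absolutely continuous. Let $J$ be a compact interval in $I$. If $f$ is bounded on $J$, say
$|f(t)| \le B$ (a.e.) on $J$, then $F$ satisfies a Lipschitz condition on $J$, since for $t$ and $s$ in
$J$ with $t \ge s$, one has

$$|F(t) - F(s)| \le \int_s^t |f(u)| \, du \le B(t - s).$$

Hence by Lemma D.13.2, $F$ is absolutely continuous. If $f$ is not bounded on $J$, then
for every $\varepsilon > 0$ we can write $f = g + h$, where $g$ is bounded on $J$ (say that $|g| \le n$
(a.e.)) and $\int_J |h| \, dt \le \varepsilon/2$, by Section D.12. One then has

$$|F(t) - F(s)| \le \int_s^t |g(u)| \, du + \int_s^t |h(u)| \, du \le n|t - s| + \int_s^t |h(u)| \, du \le n|t - s| + \frac{\varepsilon}{2}.$$

Now set $\delta = \varepsilon(2n)^{-1}$. If $J_1,\ldots,J_k$ are disjoint intervals in $J$, $J_i = [a_i,b_i]$, with $\sum_{i=1}^{k} |b_i - a_i| \le \delta$,
then the integrals of $|h|$ over the $J_i$ add up to at most $\int_J |h| \, dt \le \varepsilon/2$, so
$\sum_{i=1}^{k} |F(b_i) - F(a_i)| \le n\delta + \varepsilon/2 = \varepsilon$. Hence $F$ is absolutely continuous. ∎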


We omit the rest of the proof, which can be found in the references cited above.
However, a word of caution to the reader is necessary. There do exist continuous
functions $F$ with the property that $dF/dt$ exists almost everywhere but

$$F(t) \ne F(a) + \int_a^t \frac{dF}{ds}(s) \, ds.$$

In fact there does exist a strictly increasing continuous function $F$ with the property
that $dF/dt = 0$ (a.e.), see Riesz and Sz. Nagy [1, p. 48]. Needless to say such a
function is not absolutely continuous.
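A closely related and easier example is the Cantor function, which is continuous and nondecreasing (though not strictly increasing) on $[0,1]$ with derivative zero almost everywhere. It is not the function cited above, but it fails Definition D.13.1 in the same way, and the failure can be verified exactly. The sketch below, whose function names are illustrative, covers the Cantor set at level $k$ by $2^k$ intervals of total length $(2/3)^k$ and shows that the sum of $|F(b_i) - F(a_i)|$ over them stays equal to 1.

```python
from fractions import Fraction


def cantor(x, max_digits=200):
    """Cantor function at a rational x in [0, 1], read off the ternary
    digits of x. The result is exact when the expansion terminates or
    hits a digit 1 within max_digits digits (true for all x = j / 3**k)."""
    x = Fraction(x)
    if x <= 0:
        return Fraction(0)
    if x >= 1:
        return Fraction(1)
    value, scale = Fraction(0), Fraction(1, 2)
    for _ in range(max_digits):
        if x == 0:
            break
        x *= 3
        digit = int(x)          # next ternary digit: 0, 1, or 2
        x -= digit
        if digit == 1:
            return value + scale
        if digit == 2:
            value += scale
        scale /= 2
    return value


def cantor_cover(k):
    """The 2**k closed intervals left after k middle-third removals."""
    intervals = [(Fraction(0), Fraction(1))]
    for _ in range(k):
        refined = []
        for a, b in intervals:
            third = (b - a) / 3
            refined.append((a, a + third))
            refined.append((b - third, b))
        intervals = refined
    return intervals


def total_length(intervals):
    return sum(b - a for a, b in intervals)


def total_variation(intervals):
    return sum(abs(cantor(b) - cantor(a)) for a, b in intervals)


if __name__ == "__main__":
    for k in (1, 3, 6, 10):
        cover = cantor_cover(k)
        print(k, float(total_length(cover)), total_variation(cover))
```

Since the total length $(2/3)^k$ can be made smaller than any $\delta$ while the variation over the intervals remains 1, no $\delta$ works for $\varepsilon < 1$.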


EXERCISES 


1. If $a < b$ we define $\int_b^a f(s) \, ds = -\int_a^b f(s) \, ds$. Show that if $F(t) = \int_t^b f(s) \, ds$ for
$a \le t \le b$, where $f \in L_1[a,b]$, then $F' = -f$ (a.e.).




2. (Differential Equations) Let $f(x,t)$ be a measurable function that is continuous
in $x$ for each $t$. Assume there is an $m \in L_1[a,b]$ such that $|f(x,t)| \le m(t)$ for all
$x$ and $t$.
(a) Show that if $\phi(t)$, $a \le t \le b$, is an absolutely continuous function that
satisfies

$$\phi'(t) = f(\phi(t),t) \ \text{(a.e.)}, \quad \text{and} \quad \phi(a) = \phi_0, \tag{D.13.3}$$

then

$$\phi(t) = \phi_0 + \int_a^t f(\phi(s),s) \, ds, \quad a \le t \le b. \tag{D.13.4}$$

(b) Conversely, show that if $\phi$ satisfies (D.13.4), then $\phi$ is an absolutely
continuous function that satisfies (D.13.3).


14. THE RADON-NIKODYM THEOREM 


The Radon-Nikodym Theorem is simply an extension of the results of the 
last section to arbitrary measures. We shall present this theorem here without proof. 
For our purpose, this theorem is used primarily to discuss the conditional expecta- 
tion operator in Section E.5. 

Let $X$ be a nonempty set and let $\mathcal{F}$ be a $\sigma$-field of subsets in $X$, as in Example 2,
Section 10. Recall that a measure on $X$ is a mapping $\mu$ of $\mathcal{F}$ into the extended real
numbers $\bar{R} = R \cup \{+\infty\} \cup \{-\infty\}$ such that $\mu(\emptyset) = 0$ and

$$\mu\left( \bigcup_{n=1}^{\infty} E_n \right) = \sum_{n=1}^{\infty} \mu(E_n) \tag{D.14.1}$$

whenever $\{E_1,E_2,\ldots\}$ is a sequence of mutually disjoint sets in $\mathcal{F}$. Earlier we had
considered only positive measures, but now we wish to consider measures $\mu$ of the
form*

$$\mu = \mu^+ - \mu^-,$$

where $\mu^+$ and $\mu^-$ are positive measures. The absolute value of $\mu$ is then the positive
measure

$$|\mu| = \mu^+ + \mu^-.$$

We shall say that the measure $\mu$ is $\sigma$-finite on $X$ if there is a sequence $\{E_1,E_2,\ldots\}$ of
measurable sets that satisfy $|\mu|(E_n) < \infty$ and $\bigcup_n E_n = X$. If $|\mu|(X) < \infty$, then
we say that $\mu$ is a finite measure.
We note that the Lebesgue measure is $\sigma$-finite on $R$ and finite on any bounded
interval.
Now let $\mu$ and $\nu$ be two measures on $(X,\mathcal{F})$. We shall say that $\nu$ is absolutely
continuous with respect to $\mu$ if $|\nu|(E) = 0$ whenever $|\mu|(E) = 0$.


* One can show that every measure has this form, see Royden [1, pp. 202-206]. 




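EXAMPLE 1. Let $f \in L$ and let $\mathcal{F} = \mathcal{B}$ denote the Borel sets in $R$. Let $\mu$ denote
the Lebesgue measure on $R$ and for $E \in \mathcal{B}$ define $\nu(E)$ by

$$\nu(E) = \int_E f \, dt = \int f \chi_E \, dt. \tag{D.14.2}$$

If $f = f^+ - f^-$, then we see that

$$\nu^+(E) = \int_E f^+ \, dt, \qquad \nu^-(E) = \int_E f^- \, dt.$$

Furthermore $\nu(\emptyset) = 0$. Also, if $\{E_1,E_2,\ldots\}$ is a sequence of mutually disjoint sets,
then

$$\left| \sum_{n=1}^{N} f \chi_{E_n} \right| \le \sum_{n=1}^{\infty} |f| \chi_{E_n} \le |f|.$$

So by the Dominated Convergence Theorem one has

$$\nu\left( \bigcup_{n=1}^{\infty} E_n \right) = \int f \left( \sum_{n=1}^{\infty} \chi_{E_n} \right) dt = \sum_{n=1}^{\infty} \int f \chi_{E_n} \, dt = \sum_{n=1}^{\infty} \nu(E_n).$$

Hence $\nu$ is a measure on $\mathcal{B}$. Finally, we note that $\nu$ is absolutely continuous with
respect to $\mu$.
The Radon-Nikodym Theorem, which we state next, tells us that every finite
measure $\nu$ that is absolutely continuous with respect to the Lebesgue measure $\mu$
must satisfy (D.14.2) for some $f$ in $L$. ∎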


D.14.1 THEOREM. (RADON-NIKODYM.) Let $\mu$ be a $\sigma$-finite measure on $(X,\mathcal{F})$
and let $\nu$ be a measure that is absolutely continuous with respect to $\mu$. Then there is
a measurable function $f$ such that

$$\nu(E) = \int_E f(x) \, \mu(dx) \tag{D.14.3}$$

for all $E \in \mathcal{F}$. Furthermore $\nu$ is finite if and only if $f \in L_1(X,\mathcal{F},\mu)$.


The function $f$ appearing in (D.14.3) is called the Radon-Nikodym derivative of $\nu$
with respect to $\mu$.
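On a finite set the theorem reduces to elementary arithmetic, which may help fix ideas. The sketch below is an illustration only, assuming a finite set $X$ with every subset measurable, a positive measure `mu` given by point weights, and a signed measure `nu` that vanishes on every point of `mu`-measure zero. The derivative is then the ratio of the point weights. All names are hypothetical.

```python
from fractions import Fraction
from itertools import combinations


def radon_nikodym_derivative(mu, nu):
    """Return f with nu(E) = sum over x in E of f(x) * mu({x}).

    mu and nu map points to weights. mu must be positive, and nu must
    vanish wherever mu does (absolute continuity)."""
    f = {}
    for x, m in mu.items():
        n = nu.get(x, Fraction(0))
        if m == 0:
            if n != 0:
                raise ValueError("nu is not absolutely continuous w.r.t. mu")
            f[x] = Fraction(0)   # any value works on a mu-null point
        else:
            f[x] = n / m
    return f


def measure(weights, event):
    return sum(weights[x] for x in event)


def integrate(f, mu, event):
    return sum(f[x] * mu[x] for x in event)


def all_events(points):
    """Every subset of the finite set of points."""
    points = list(points)
    for size in range(len(points) + 1):
        yield from combinations(points, size)


MU = {"a": Fraction(1, 2), "b": Fraction(1, 4), "c": Fraction(1, 4), "d": Fraction(0)}
NU = {"a": Fraction(1), "b": Fraction(-1, 2), "c": Fraction(0), "d": Fraction(0)}

if __name__ == "__main__":
    f = radon_nikodym_derivative(MU, NU)
    print(f)
    print(all(integrate(f, MU, E) == measure(NU, E) for E in all_events(MU)))
```

The value assigned on the $\mu$-null point `"d"` is arbitrary, which mirrors the fact that the Radon-Nikodym derivative is determined only almost everywhere.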


EXAMPLE 2. One can reduce the theory of the last section to a special case of
the Radon-Nikodym Theorem. We shall illustrate this reduction by considering a
monotone increasing function $F$ that is absolutely continuous.
One can construct a measure $\nu$ on the Borel sets $\mathcal{B}$ by using $F$. First, for each
interval $E$ with end points $\{a,b\}$, where $a \le b$, we let

$$\nu(E) = F(b) - F(a).$$




Next we extend $\nu$ to countable unions of intervals by using (D.14.1). Continuing
this way, by repeated use of (D.14.1), we can extend $\nu$ to all of $\mathcal{B}$.

Finally the fact that the measure $\nu$ is absolutely continuous with respect to the
Lebesgue measure $\mu$ is a direct consequence of the definition of absolute continuity
for functions and the characterization of a null set appearing in Exercise 1,
Section 5. ∎


15. FUBINI THEOREM 


We close this appendix with an abbreviated statement of the Fubini Theorem, 
which gives conditions under which one can interchange the order of integration. 
For this we are interested in functions of the form $f(x,y)$ defined on $A \times B$ and
we ask: When is it true that

$$\int_A \left[ \int_B f(x,y) \, dy \right] dx = \int_B \left[ \int_A f(x,y) \, dx \right] dy? \tag{D.15.1}$$

One can show (Royden [1, pp. 233-234]) that if $f(x,y)$ is a measurable function that
satisfies either one of the conditions:

(i) $f \in L(A \times B)$;
(ii) $f \ge 0$;

then (D.15.1) is valid.
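The two conditions cannot simply be dropped. A standard illustration, which is not taken from the text, uses counting measure on pairs of nonnegative integers, so that the iterated integrals become iterated sums. In the sketch below the array `a(m, n)` changes sign and is not absolutely summable, and the two orders of summation give different answers. Each inner sum has only finitely many nonzero terms, so it is computed exactly.

```python
def a(m, n):
    """Doubly indexed array: +1 on the diagonal, -1 just below it, else 0."""
    if m == n:
        return 1
    if m == n + 1:
        return -1
    return 0


def rows_then_columns(N):
    """Sum over m < N of (sum over all n of a(m, n)).
    The inner sum is exact because a(m, n) = 0 for n > m."""
    return sum(sum(a(m, n) for n in range(m + 1)) for m in range(N))


def columns_then_rows(N):
    """Sum over n < N of (sum over all m of a(m, n)).
    The inner sum is exact because a(m, n) = 0 for m > n + 1."""
    return sum(sum(a(m, n) for m in range(n + 2)) for n in range(N))


def absolute_sum(N):
    """Sum of |a(m, n)| over the square m, n < N; it grows without bound."""
    return sum(abs(a(m, n)) for m in range(N) for n in range(N))


if __name__ == "__main__":
    for N in (5, 50, 500):
        print(N, rows_then_columns(N), columns_then_rows(N), absolute_sum(N))
```

The first iterated sum equals 1 and the second equals 0 for every truncation level, while the sum of absolute values grows like $2N - 1$; neither condition (i) nor condition (ii) holds for this array.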


SUGGESTED REFERENCES 


Asplund and Bungart [1] Royden [1] 
Halmos [1] Rudin [1] 
Loomis [1] 


Appendix E 


Probability Spaces 
and Stochastic Processes 


1. PROBABILITY SPACES 


Probability spaces are the basic mathematical models for random phenomena.
Intuitively the situation is as follows. Suppose one is given a collection of random
phenomena. One then pictures that off someplace in the background is a nonempty
set $\Omega$, which is called the sample space. On each experiment, the "forces that be"
pick one of the sample points $\omega \in \Omega$. This sample point $\omega$, in turn, determines the
values associated with all the random phenomena in that particular experiment.

In addition one is also interested in certain events, which are merely subsets, or
perhaps, distinguished subsets of $\Omega$. The relationship $\omega \in A$ means that the event $A$
"occurred" during the given experiment. The probability that the event $A$ will occur
is then a real number $P(A)$ satisfying $0 \le P(A) \le 1$.

Let us now make this intuitive picture more precise. For this purpose, we will 
assume that the reader is familiar with Appendix D, in general, and Section D.10, in 
particular. 

A probability space is a triple $(\Omega,\mathcal{F},P)$, where $\Omega$ is a nonempty set, $\mathcal{F}$ is a
$\sigma$-field of subsets of $\Omega$ and $P$ is a positive measure with $P(\Omega) = 1$. In this case $P$ is
sometimes called a probability measure.

Before we look at some examples of probability spaces let us list here a partial
dictionary relating certain measure-theoretic and probability-theoretic terms.


Probability Space               Measure Space
Sample Point                    Point in the Space
Event                           Measurable Set
Probability                     Measure
Sure Event                      Whole Space
Impossible Event                Empty Set
Event with Probability Zero     Set of Measure Zero
Almost Sure                     Almost Everywhere
Random Variable                 Measurable Function
Expectation                     Integral


EXAMPLE 1. Let $\Omega$ denote the collection of all possible initial moves for white
in a chess game. Thus $\Omega$ consists of 20 points. The reader should easily think of
several candidates for probability measures on $\Omega$. ∎


EXAMPLE 2. Let $\Omega$ denote the collection of all possible outcomes of an experiment
involving 50 flips of a coin. A typical sample point, then, would be an ordered
50-tuple $(H,H,T,\ldots,H)$ consisting of heads ($H$) and tails ($T$). Let $p$ denote the
probability of getting $H$ on any toss and $q = 1 - p$ the probability of getting $T$. Let
$A$ be the event consisting of all outcomes with $n$ heads and $50 - n$ tails. It is well
known that

$$P(A) = \frac{50!}{(50 - n)! \, n!} \, p^n q^{50-n}. \quad ∎$$
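The formula in Example 2 is easy to evaluate exactly. Here is a small sketch, assuming independent tosses with a rational probability `p` of heads; `prob_n_heads` is an illustrative name. It also checks that the events "exactly $n$ heads", $n = 0,\ldots,50$, have probabilities summing to 1, as they must since they partition $\Omega$.

```python
from fractions import Fraction
from math import comb


def prob_n_heads(n, p, flips=50):
    """P(A) for A = {outcomes with n heads and flips - n tails}."""
    q = 1 - p
    return comb(flips, n) * p**n * q ** (flips - n)


if __name__ == "__main__":
    p = Fraction(1, 3)
    total = sum(prob_n_heads(n, p) for n in range(51))
    print(total)                              # the events partition the sample space
    print(float(prob_n_heads(25, Fraction(1, 2))))
```

Using `Fraction` keeps the arithmetic exact, so the check that the probabilities sum to 1 is not affected by rounding.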


EXAMPLE 3. Let $\Omega = R$ and let $f \in L_2(-\infty,\infty)$ with $\int_{-\infty}^{\infty} |f|^2 \, dt = 1$. Then

$$P(A) = \int_A |f|^2 \, dt$$

is a probability measure on $\Omega$. This type of probability space occurs in the study of
quantum mechanics. ∎


EXAMPLE 4. Let $\Omega = R^2$ and using polar coordinates define $f: R^2 \to R$ by


foe | V203"r xp(5)] - 
Then

$$P(A) = \iint_A f \, dA$$

is a probability measure on $\Omega$. ∎


2. RANDOM VARIABLES AND DISTRIBUTION FUNCTIONS


A probability space is an abstraction that appears only implicitly in many 
applications. The concrete formulation of these problems is usually in terms of 
certain random variables and their distributions. 

A (real) random variable is simply a real-valued measurable function defined on 
the sample space $\Omega$. (See Example 2, Section D.10.) A (complex) random variable
is a complex-valued function whose real and imaginary parts are real random 
variables. 

Although the definition of a random variable is in terms of a function defined 
on the sample space, in many applications one is oftentimes only interested in the 
distribution function of a random variable. 

Let $X$ be a real random variable defined on a probability space $(\Omega,\mathcal{F},P)$. The
(probability) distribution function of $X$, denoted by $F(x)$, is

$$F(x) = P[X(\omega) \le x].$$

That is, $F(x)$ is the probability of the event

$$A = \{\omega \in \Omega : X(\omega) \le x\}.$$


For complex random variables, both the real and imaginary parts have probability 
distribution functions. 




EXAMPLE 1. Let $\Omega$ denote the 36 possible outcomes of rolling two dice.
Assume that each outcome is equally likely, so that the probability measure is
merely $P(A) = \operatorname{card}(A)/36$. Now let $X$ be the random variable that assigns to each
outcome the total points on the two dice. The distribution function for $X$ is illustrated
in Figure E.2.1. ∎


[Figure E.2.1. Distribution Function for Example 1. Horizontal axis: $x$; vertical axis: $F(x)$.]
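The distribution function of Example 1 can be tabulated directly from the definition. The sketch below assumes the convention $F(x) = P[X(\omega) \le x]$ and the uniform measure on the 36 outcomes; the names `OUTCOMES`, `X`, and `F` are chosen for the illustration.

```python
from fractions import Fraction
from itertools import product

# Sample space: ordered pairs (first die, second die), all equally likely.
OUTCOMES = list(product(range(1, 7), repeat=2))


def X(w):
    """Random variable: total points on the two dice."""
    return w[0] + w[1]


def F(x):
    """Distribution function F(x) = P[X <= x] = card{w : X(w) <= x} / 36."""
    favourable = [w for w in OUTCOMES if X(w) <= x]
    return Fraction(len(favourable), len(OUTCOMES))


if __name__ == "__main__":
    for x in range(1, 13):
        print(x, F(x))
```

The table is a step function: it is 0 below 2, jumps at each integer from 2 through 12, and equals 1 from 12 onward, which is the staircase shape a plot of this distribution function would show.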


Let us note a few properties of the distribution function. First we note that
$0 \le F(x) \le 1$ for all $x$ and $F(-\infty) = 0$ and $F(\infty) = 1$. Also, if $x \le y$, then

$$\{\omega \in \Omega : X(\omega) \le x\} \subseteq \{\omega \in \Omega : X(\omega) \le y\}.$$

Hence $F(x) \le F(y)$, that is, $F$ is monotone increasing. Also, if $x_1 < x_2$, then

$$P[x_1 < X(\omega) \le x_2] = F(x_2) - F(x_1).$$


If, in addition, $F(x)$ is absolutely continuous (see Section D.13), then there is a
function $f \in L_1(-\infty,\infty)$ such that

$$F(x) = \int_{-\infty}^{x} f(t) \, dt$$

for all $x$. In this case we say that $f$ is the (probability) density function for the random
variable $X$.


EXAMPLE 2. Consider $\Omega = R$. Let $g \in L_2(-\infty,\infty)$, with $\int_{-\infty}^{\infty} |g|^2 \, dt = 1$,
and let

$$P(A) = \int_A |g|^2 \, dt$$

be the probability measure on $\Omega$. Let $X(t) = t$ be a random variable on $\Omega$. The
distribution function for $X$ is

$$F(x) = P[t \le x] = \int_{-\infty}^{x} |g|^2 \, dt$$

and the density function is $|g(x)|^2$. We ask the reader to compute the distribution and
density functions for the random variable $X(t) = t^2$. ∎




Let $X$ and $Y$ be two real random variables on $\Omega$. The joint (probability) distribution
function is defined by

$$F(x,y) = P[X(\omega) \le x \text{ and } Y(\omega) \le y].$$

Similarly the joint distribution function for the real random variables $\{X_1,\ldots,X_n\}$
is given by

$$F(x_1,\ldots,x_n) = P[X_1 \le x_1, \ldots, X_n \le x_n].$$


3. EXPECTATION


As suggested in the first section, the expectation is the integral. Indeed, if $X$ is
a random variable we define the expected value to be $E[X]$, where¹

$$E[X] = \int_\Omega X(\omega) P(d\omega).$$

The expected value may not exist for all random variables. However, if

$$E[|X|] = \int_\Omega |X(\omega)| P(d\omega) < \infty,$$

then $E[X]$ is finite and well defined. Thus, if $X$ belongs to $L_1(\Omega,\mathcal{F},P)$, then $X$ has a
finite expected value.

A real random variable $X$ in $L_2(\Omega,\mathcal{F},P)$, that is, $X$ satisfies $E[|X|^2] < \infty$, is
said to have a finite second moment. Since² $L_2(\Omega,\mathcal{F},P) \subset L_1(\Omega,\mathcal{F},P)$, we see that
if $X$ has a finite second moment, then it has a finite expected value. In this case, we
define the variance $\sigma^2(X)$ by

$$\sigma^2(X) = E[|X - E[X]|^2]$$

and $\sigma(X)$ is called the standard deviation of $X$.


If $X$ and $Y$ are two real random variables in $L_2(\Omega,\mathcal{F},P)$, then the covariance of
$X$ and $Y$ is defined by

$$\operatorname{Cov}(X,Y) = E[(X - \mu_x)(Y - \mu_y)],$$

where $\mu_x = E[X]$ and $\mu_y = E[Y]$. The correlation coefficient is, then,

$$\rho(X,Y) = \frac{\operatorname{Cov}(X,Y)}{\sigma_x \sigma_y},$$

where $\sigma_x$ and $\sigma_y$ denote the standard deviation of $X$ and $Y$, respectively.
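On a finite probability space these definitions become finite sums. The following sketch reuses the two-dice sample space with the uniform measure as an assumed setting and computes the expected value, variance, covariance, and correlation coefficient straight from the formulas above. The helper names `E`, `var`, `cov`, and `corr` are illustrative.

```python
from fractions import Fraction
from itertools import product
from math import sqrt

OMEGA = list(product(range(1, 7), repeat=2))   # two dice, 36 outcomes
P = Fraction(1, len(OMEGA))                    # uniform probability of each point


def E(X):
    """Expected value: the integral of X over Omega with respect to P."""
    return sum(X(w) * P for w in OMEGA)


def var(X):
    mu = E(X)
    return E(lambda w: (X(w) - mu) ** 2)


def cov(X, Y):
    mx, my = E(X), E(Y)
    return E(lambda w: (X(w) - mx) * (Y(w) - my))


def corr(X, Y):
    return float(cov(X, Y)) / sqrt(float(var(X)) * float(var(Y)))


def first(w):
    return w[0]


def total(w):
    return w[0] + w[1]


if __name__ == "__main__":
    print(E(total), var(total), cov(first, total), corr(first, total))
```

For these two variables the covariance of the first die with the total equals the variance of the first die, because the second die contributes nothing to the covariance.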


EXERCISES 


1. Show that $L_2(\Omega,\mathcal{F},P) \subset L_1(\Omega,\mathcal{F},P)$.
2. Let $X$ be a complex random variable in $L_2(\Omega,\mathcal{F},P)$. Show that
$E[|X - E[X]|^2] = E[|X|^2] - |E[X]|^2$.

¹ Again we assume familiarity with the integral as constructed in Appendix D, especially in Example
2, Section D.10.
² See Exercise 1.




3. Let $X$ and $Y$ be real random variables in $L_2(\Omega,\mathcal{F},P)$. Show that
$\operatorname{Cov}(X,Y) = E[XY] - \mu_x \mu_y$,
where $\mu_x = E[X]$ and $\mu_y = E[Y]$.


4. STOCHASTIC INDEPENDENCE 


Let $X$ and $Y$ be real random variables on a probability space $(\Omega,\mathcal{F},P)$. We
wish to introduce the concept of stochastic independence for these random variables.
Roughly, this means the values for $X$ should somehow be independent of
the values for $Y$. This concept can be made precise by using the distribution functions.
Specifically, we say that $X$ and $Y$ are (stochastically) independent if

$$P[X(\omega) \le x \text{ and } Y(\omega) \le y] = P[X(\omega) \le x] \cdot P[Y(\omega) \le y] \tag{E.4.1}$$

for all $x$ and $y$. Similarly, we say that a finite collection of real random variables
$\{X_1,\ldots,X_n\}$ is (stochastically) independent if

$$P[X_1 \le x_1, \ldots, X_n \le x_n] = P[X_1 \le x_1] \cdots P[X_n \le x_n]$$

for all $x_1,\ldots,x_n$. In general, an arbitrary collection of real random variables is
said to be (stochastically) independent if every finite subcollection is independent.
We say that two complex random variables $X$ and $Y$ are (stochastically) independent
if each of the following four sets of real random variables are independent:

$$\{\operatorname{Re} X, \operatorname{Re} Y\}, \quad \{\operatorname{Re} X, \operatorname{Im} Y\}, \quad \{\operatorname{Im} X, \operatorname{Re} Y\}, \quad \{\operatorname{Im} X, \operatorname{Im} Y\}.$$

In a similar way we define stochastic independence for finite and infinite collections
of complex random variables.
There is one very important fact concerning independent random variables. 


E.4.1 THEOREM. Let $X$ and $Y$ be stochastically independent random variables
in $L_2(\Omega,\mathcal{F},P)$. Then

$$E[XY] = E[X]E[Y] \quad \text{and} \quad E[X\overline{Y}] = E[X]\overline{E[Y]}.$$

We shall not prove this theorem here since it involves concepts not fully
developed in this book. However, a proof can be found in Loeve [1].
Before we look at some examples let us note that if $X$ and $Y$ are real random
variables, then they are stochastically independent if and only if

$$P[X(\omega) > x \text{ and } Y(\omega) > y] = P[X(\omega) > x] P[Y(\omega) > y] \tag{E.4.2}$$

for all $x$ and $y$.


EXAMPLE 1. Let $A$ and $B$ be two events in $\Omega$, and let $X_A$ and $X_B$ denote their
characteristic functions, that is,

$$X_A(\omega) = \begin{cases} 1, & \omega \in A, \\ 0, & \omega \notin A. \end{cases}$$

We claim that the random variables $X_A$ and $X_B$ are independent if and only if

$$P(A \cap B) = P(A)P(B). \tag{E.4.3}$$

Indeed if (E.4.2) is valid for all $x$ and $y$, then for $0 \le x < 1$ and $0 \le y < 1$ one has

$$P(A \cap B) = P[X_A(\omega) > x \text{ and } X_B(\omega) > y] = P[X_A(\omega) > x] P[X_B(\omega) > y] = P(A)P(B).$$

Conversely, if (E.4.3) is valid, then it is easy to see that by reversing the above
reasoning (E.4.2) holds whenever $0 \le x < 1$ and $0 \le y < 1$. The remaining cases, where
$x$ or $y$ is negative or at least 1, are trivially checked. ∎
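The equivalence in Example 1 can be tested on a small sample space. The sketch below assumes the two-dice space with the uniform measure and checks both (E.4.3) and (E.4.2) for the characteristic functions of a pair of independent events and a pair of dependent ones. The events and the helper names are chosen for the illustration and are not from the text.

```python
from fractions import Fraction
from itertools import product

OMEGA = list(product(range(1, 7), repeat=2))   # two dice, uniform measure


def prob(event):
    return Fraction(sum(1 for w in OMEGA if event(w)), len(OMEGA))


def indicator(event):
    """Characteristic function of an event, as a random variable."""
    return lambda w: 1 if event(w) else 0


def satisfies_E42(X, Y, thresholds):
    """Check P[X > x and Y > y] = P[X > x] P[Y > y] on a grid of thresholds."""
    return all(
        prob(lambda w: X(w) > x and Y(w) > y)
        == prob(lambda w: X(w) > x) * prob(lambda w: Y(w) > y)
        for x in thresholds
        for y in thresholds
    )


def first_even(w):
    return w[0] % 2 == 0


def sum_is_seven(w):
    return w[0] + w[1] == 7


def sum_at_least_ten(w):
    return w[0] + w[1] >= 10


THRESHOLDS = [-1, 0, Fraction(1, 2), 1, 2]

if __name__ == "__main__":
    for B in (sum_is_seven, sum_at_least_ten):
        both = prob(lambda w: first_even(w) and B(w))
        print(B.__name__, both == prob(first_even) * prob(B),
              satisfies_E42(indicator(first_even), indicator(B), THRESHOLDS))
```

Since a characteristic function takes only the values 0 and 1, a few thresholds on either side of 0 and 1 already cover every distinct case of (E.4.2).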


EXAMPLE 2. Let $X$ and $Y$ be independent random variables and let $u$ and $v$
be two measurable functions. Then $u(X)$ and $v(Y)$ are independent random
variables. ∎


We caution the reader not to confuse the concept of stochastic independence 
with the concept of linear independence introduced in Chapter 4. 


EXERCISES 

1. Let $X$ and $Y$ be nonzero stochastically independent random variables in
$L_2(\Omega,\mathcal{F},P)$.
(a) Show that $\operatorname{Cov}(X,Y) = 0$.
(b) Show that if $E[X] = E[Y] = 0$, then $X$ and $Y$ are linearly independent.
[Hint: Let $Z = aX + bY = 0$, then compute $\operatorname{Cov}(Z,X)$ and $\operatorname{Cov}(Z,Y)$.]


2. Construct two linearly independent random variables $X$ and $Y$ such that $E[X] =
E[Y] = 0$ and where $X$ and $Y$ are not stochastically independent.


3. Let $\{X_1,\ldots,X_n\}$ be a collection of stochastically independent random variables.
Let $U = g(X_1,\ldots,X_m)$ and $V = h(X_{m+1},\ldots,X_n)$, where $1 \le m < n$. Show that
$U$ and $V$ are stochastically independent.


5. CONDITIONAL EXPECTATION OPERATOR 


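Let $(\Omega,\mathcal{F},P)$ be a probability space and let $\mathcal{B}$ be a subcollection of $\mathcal{F}$. One
says that $\mathcal{B}$ is a sub-$\sigma$-field if $\mathcal{B}$ itself is a $\sigma$-field of $\Omega$. Let us now look at a few
examples.


EXAMPLE 1. $\mathcal{B} = \{\emptyset,\Omega\}$. ∎
EXAMPLE 2. $\mathcal{B} = \mathcal{F}$. ∎


Of course, these two examples represent the extreme cases. All other sub-$\sigma$-fields
lie in between.


EXAMPLE 3. Let $A \in \mathcal{F}$. Then $\mathcal{B} = \{\emptyset,A,A',\Omega\}$ is the $\sigma$-field generated
by $A$. ∎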




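EXAMPLE 4. Let $A, B \in \mathcal{F}$, where $A \cap B = \emptyset$, and let

$$C = (A \cup B)' = A' \cap B'.$$

Then

$$\mathcal{B} = \{\emptyset,A,B,C,A',B',C',\Omega\}$$

is the sub-$\sigma$-field generated by $A$ and $B$, that is, $\mathcal{B}$ is the smallest $\sigma$-field containing
$A$ and $B$. (Describe the sub-$\sigma$-field generated by $A$ and $B$ when one does not assume
that $A \cap B = \emptyset$.) ∎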


The sub-o-fields we shall be primarily interested in are described in the follow- 
ing example. 


EXAMPLE 5. Let $Y$ be a real random variable on $(\Omega,\mathcal{F},P)$. This means that
for every Borel set $A$ in $R$, the set $Y^{-1}(A) \in \mathcal{F}$. Now let $\mathcal{B}$ be the smallest $\sigma$-field in
$\mathcal{F}$ that contains all events of the form $Y^{-1}(A)$, where $A$ is a Borel set in $R$. Then $\mathcal{B}$
is said to be the (sub)-$\sigma$-field generated by $Y$.

For example, if $Y = X_A$ is the characteristic function of an event $A$, then $\mathcal{B}$ is as
given in Example 3. Or, if $Y = \alpha X_A + \beta X_B$, where $0 < \alpha < \beta$ and $A \cap B = \emptyset$, then
$\mathcal{B}$ is given in Example 4. ∎


We can now define conditional expectation. First let $\mathcal{B}$ be a sub-$\sigma$-field of $\mathcal{F}$
and let $X \in L_1(\Omega,\mathcal{F},P)$. Then define $\nu(B)$ by

$$\nu(B) = \int_B X(\omega) P(d\omega), \quad B \in \mathcal{B}.$$

It is easy to see that $\nu$ is a measure on $\mathcal{B}$ and that it is absolutely continuous with
respect to $P_\mathcal{B}$, which is the restriction of $P$ to $\mathcal{B}$. So by the Radon-Nikodym
Theorem there is a (unique) random variable $E^\mathcal{B}[X]$ that is measurable³ with respect
to $\mathcal{B}$ and that satisfies

$$\nu(B) = \int_B E^\mathcal{B}[X](\omega) P_\mathcal{B}(d\omega) = \int_B X(\omega) P(d\omega), \tag{E.5.1}$$

or, as it is sometimes written,

$$\int_B E^\mathcal{B}[X] \, dP = \int_B X \, dP.$$

This random variable $E^\mathcal{B}[X]$ is called the conditional expectation of $X$ with respect
to $\mathcal{B}$. If $\mathcal{B}$ is the $\sigma$-field generated by a random variable $Y$, then we shall denote
$E^\mathcal{B}[X]$ by $E^Y[X]$ and call it the conditional expectation of $X$ with respect to $Y$.


³ There is a subtlety here which should not be overlooked. The first random variable $X$ is measurable
with respect to $\mathcal{F}$, whereas $E^\mathcal{B}[X]$ is measurable with respect to $\mathcal{B}$. That is, $X^{-1}(A) \in \mathcal{F}$ and
$E^\mathcal{B}[X]^{-1}(A) \in \mathcal{B}$ for all Borel sets $A$ in $R$. If $X$ itself is measurable with respect to $\mathcal{B}$, then the uniqueness
part of the Radon-Nikodym Theorem would tell us that $X = E^\mathcal{B}[X]$.




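EXAMPLE 6. If $\mathcal{B} = \{\emptyset,\Omega\}$, then $E^\mathcal{B}[X] = E[X]$, that is, $E^\mathcal{B}$ maps $X$ onto
the constant function $E[X]$. ∎


EXAMPLE 7. If $\mathcal{B} = \mathcal{F}$, then $E^\mathcal{B}[X] = X$. ∎
EXAMPLE 8. If $\mathcal{B} = \{\emptyset,A,A',\Omega\}$ and if $0 < P(A) < 1$, then

$$E^\mathcal{B}[X](\omega) = \begin{cases} \dfrac{1}{P(A)} \displaystyle\int_A X(\omega) P(d\omega), & \text{if } \omega \in A, \\[2ex] \dfrac{1}{P(A')} \displaystyle\int_{A'} X(\omega) P(d\omega), & \text{if } \omega \in A'. \end{cases} \quad ∎$$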
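Example 8 can be worked out concretely on a finite space. The sketch below assumes the two-dice sample space with the uniform measure, takes $X$ to be the total and $A$ the event that the first die is even, builds $E^\mathcal{B}[X]$ from the formula in Example 8, and then checks the defining relation (E.5.1) on each of the four sets of $\mathcal{B}$. The names `integral` and `cond_exp` are illustrative.

```python
from fractions import Fraction
from itertools import product

OMEGA = list(product(range(1, 7), repeat=2))   # two dice, uniform measure
P = Fraction(1, len(OMEGA))


def integral(X, event):
    """Integral of X over the event with respect to P."""
    return sum(X(w) * P for w in OMEGA if event(w))


def cond_exp(X, A):
    """Conditional expectation of X given the sigma-field {empty, A, A', Omega}.

    Assumes 0 < P(A) < 1. The result is constant on A and on A'."""
    p_A = integral(lambda w: 1, A)
    on_A = integral(X, A) / p_A
    on_Ac = integral(X, lambda w: not A(w)) / (1 - p_A)
    return lambda w: on_A if A(w) else on_Ac


def total(w):
    return w[0] + w[1]


def first_even(w):
    return w[0] % 2 == 0


if __name__ == "__main__":
    Z = cond_exp(total, first_even)
    events = {
        "empty": lambda w: False,
        "A": first_even,
        "A'": lambda w: not first_even(w),
        "Omega": lambda w: True,
    }
    for name, B in events.items():
        print(name, integral(Z, B), integral(total, B))
```

The two printed columns agree on every set of the sub-$\sigma$-field, although $Z$ and $X$ are different functions on $\Omega$; taking $B = \Omega$ also shows that $Z$ and $X$ have the same expected value.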


EXERCISES 


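1. Let $X = c = $ constant (a.e.), that is, almost everywhere. Show that
$E^\mathcal{B}[X] = c$ (a.e.).

2. Show that $E^\mathcal{B}$ is a linear operator on $L_1(\Omega,\mathcal{F},P)$.

3. Show that if $X \le Y$ (a.e.), then $E^\mathcal{B}[X] \le E^\mathcal{B}[Y]$ (a.e.).

4. Show that

$$-E^\mathcal{B}[|X|] \le E^\mathcal{B}[X] \le E^\mathcal{B}[|X|] \quad \text{(a.e.)}.$$

5. Show that $|E^\mathcal{B}[X]| \le E^\mathcal{B}[|X|]$ (a.e.) and hence, $E^\mathcal{B}$ maps $L_1(\Omega,\mathcal{F},P)$ into
$L_1(\Omega,\mathcal{B},P)$.

6. Show that $E^\mathcal{B}$ is a projection on $L_1(\Omega,\mathcal{F},P)$.

7. Show that $E^\mathcal{B}$ maps $L_2(\Omega,\mathcal{F},P)$ into $L_2(\Omega,\mathcal{F},P)$ and that $E^\mathcal{B}$ is a projection
on this space.

8. (Dominated Convergence.) Assume that $\lim X_n = X$ (a.e.), where $|X_n| \le Y$ (a.e.).
Show that if $Y \in L_1(\Omega,\mathcal{F},P)$, then $\lim E^\mathcal{B}[X_n] = E^\mathcal{B}[X]$ (a.e.).

9. Show that

$$\int_\Omega E^\mathcal{B}[|X|^2] \, dP_\mathcal{B} = \int_\Omega |X|^2 \, dP \tag{E.5.2}$$

for all $X$ in $L_2(\Omega,\mathcal{F},P)$. [Hint: First show that (E.5.2) holds for simple functions
and then pass to the limit.]

10. Let $X, Y \in L_2(\Omega,\mathcal{F},P)$ and assume that $X$ is measurable with respect to $\mathcal{B}$.
Show that

$$E^\mathcal{B}[XY] = X E^\mathcal{B}[Y]. \tag{E.5.3}$$

[Hint: First show that (E.5.3) holds when $X$ is a simple function and then pass
to the limit.]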




11. Let $X, Y \in L_2(\Omega,\mathcal{F},P)$. Show that

$$E^\mathcal{B}[E^\mathcal{B}[X]Y] = E^\mathcal{B}[X]E^\mathcal{B}[Y].$$

12. Discuss $E^\mathcal{B}[X]$ when $X$ is complex-valued. Show that the above properties are
valid even in this case.


6. STOCHASTIC PROCESSES


A Stochastic process is simply a family of random variables X,, where ¢ lies 
in some index set. If t belongs to some interval on the real line, this is referred to as 
a continuous process. If f ranges over some countable set, this is referred to as a 
discrete process. In the later case one often writes X, in place of X,. 


EXAMPLE 1. For $t$ in the interval $I$, let $X(t,\omega)$ be a random variable with
finite second moment, that is,

$$E[|X(t,\cdot)|^2] = \int |X(t,\omega)|^2\,P(d\omega) < \infty.$$

Let us abbreviate the notation and write $X(t)$ in place of $X(t,\cdot)$.
One says that $X(t)$ is continuous in $t$ if for every $\varepsilon > 0$ there is a $\delta > 0$ such that
if $|h| < \delta$, then

$$E[|X(t) - X(t+h)|^2] < \varepsilon^2.$$

Assume that $E[X(t)] = 0$ for all $t$ in $I$. Now define the covariance function by

$$r(t,s) = E[X(t)X(s)].$$

We leave it to the reader to show that $r(t,s)$ is continuous in $t$ and $s$ when $X(t)$ is
continuous in $t$. ∎
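One possible route to the continuity of the covariance function in Example 1 is sketched below. It assumes that $X(t)$ is real-valued and uses only the Schwarz and Minkowski inequalities; the increments $h$ and $k$ are notation introduced here for illustration. The first line is an algebraic identity, the second applies the Schwarz inequality to each term, and the third shows that the second moment of $X(s+k)$ stays bounded for small $k$.

```latex
\begin{align*}
r(t+h,s+k) - r(t,s)
  &= E\big[(X(t+h)-X(t))\,X(s+k)\big] + E\big[X(t)\,(X(s+k)-X(s))\big], \\
|r(t+h,s+k) - r(t,s)|
  &\le \big(E[|X(t+h)-X(t)|^2]\big)^{1/2}\,\big(E[|X(s+k)|^2]\big)^{1/2} \\
  &\quad + \big(E[|X(t)|^2]\big)^{1/2}\,\big(E[|X(s+k)-X(s)|^2]\big)^{1/2}, \\
\big(E[|X(s+k)|^2]\big)^{1/2}
  &\le \big(E[|X(s)|^2]\big)^{1/2} + \big(E[|X(s+k)-X(s)|^2]\big)^{1/2}.
\end{align*}
```

When $X$ is continuous at $t$ and at $s$ in the sense defined in Example 1, both increment terms tend to zero as $h$ and $k$ tend to zero, so the right side of the second inequality tends to zero as well.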


SUGGESTED REFERENCES 


Doob [1] Kolmogorov [1] 
Feller [1] Loeve [1] 
Halmos [1] 


References 


S. AGMON 
[1] Lectures on Elliptic Boundary Value Problems. Princeton, N.J.: D. Van 
Nostrand Company, Inc., 1965. 
N. I. AKHIEZER AND I. M. GLAZMAN 
[1] Theory of Linear Operators in Hilbert Space. New York: Frederick Ungar 
Publishing Co., 1961. 
N. ARONSZAJN 
[1] “Approximation techniques for eigenvalues of completely continuous
symmetric operators.” Proceedings of the Symposium on Spectral Theory
and Differential Problems. Stillwater: Oklahoma College, 1951, pp. 179-202. 
R. B. ASH
[1] Information Theory. New York: John Wiley & Sons, Inc., 1965. 
E. ASPLUND AND L. BUNGART 
[1] A First Course in Integration. New York: Holt, Rinehart and Winston, Inc., 
1966. 
G. BACHMAN AND L. NARICI 
[1] Functional Analysis. New York: Academic Press, Inc., 1966. 
S. BANACH 
[1] Théorie des opérations linéaires. New York: Chelsea Publishing Company,
1955. 
R. G. BARTLE 
[1] The Elements of Real Analysis. New York: John Wiley & Sons, Inc., 1964. 
L. BERS, F. JOHN, AND M. SCHECHTER
[1] Partial Differential Equations. New York: Interscience Publishers, 1964. 
A. S. BESICOVITCH 
[1] Almost Periodic Functions. New York: Dover Publications, Inc., 1954. 
F. BEUTLER
[1] “Sampling theorems and bases in a Hilbert space,” Information and
Control, 4, No. 2-3 (Sept. 1961), 97-117. 
P. BILLINGSLEY 
[1] Ergodic Theory and Information. New York: John Wiley & Sons, Inc., 1965. 
[2] Convergence of Probability Measures. New York: John Wiley & Sons, Inc., 
1968. 
R. P. BOAS, JR.
[1] A Primer of Real Functions. Washington, D.C.: Math. Assn. of America, 
1960. 




S. BOCHNER 
[1] Fourier Transforms. Princeton, N.J.: Princeton University Press, 1949.
F. E. BROWDER 
[1] “Nonlinear mappings of nonexpansive and accretive type in Banach
spaces,” Bull. Amer. Math. Soc., 73 (1967), 875-882. 
L. CESARI 
[1] Asymptotic Behavior and Stability Problems in Ordinary Differential 
Equations. Berlin: Springer-Verlag, 1963. 
E. CODDINGTON AND N. LEVINSON 
[1] Ordinary Differential Equations. New York: McGraw-Hill, Inc., 1955. 
R. COURANT AND D. HILBERT 
[1] Methods of Mathematical Physics. Vol. I and II. New York: Interscience
Publishers, 1953 and 1962. 
R. COURANT AND H. ROBBINS 
[1] What is Mathematics? London: Oxford University Press, 1941. 
M. DAMBORG AND A. NAYLOR 
[1] “The fundamental structure of input-output stability for feedback
systems,” IEEE Transactions on System Science and Cybernetics, Vol.
SSC-6, No. 2 (April 1970). 
M. M. DAY
[1] Normed Linear Spaces. Berlin: Springer-Verlag, 1962. 
J. A. DIEUDONNE 
[1] Foundations of Modern Analysis. New York: Academic Press, Inc., 1960.
J. L. DOOB
[1] Stochastic Processes. New York: John Wiley & Sons, Inc., 1953. 
N. DUNFORD AND J. SCHWARTZ 
[1] Linear Operators, Parts I and II. New York: Interscience Publishers, 1958 
and 1963. 
A. DVORETZKY AND C. ROGERS 
[1] “Absolute and unconditional convergence in normed linear spaces,” Proc.
Nat. Acad. Sci. U.S.A. 36 (1950), 192-197.
R. E. EDWARDS 
[1] Functional Analysis. New York: Holt, Rinehart and Winston, Inc., 1965. 
[2] Fourier Series, Vol. I and II. New York: Holt, Rinehart and Winston, Inc.,
1967. 
L. ENGEL 
[1] How to Buy Stocks, 4th edition. New York: Bantam Books, 1967. 
W. FELLER 
[1] An Introduction to Probability Theory and Its Applications, Vol. I and II. 
New York: John Wiley & Sons, Inc., 1957 and 1966. 
A. FRIEDMAN 
[1] Generalized Functions and Partial Differential Equations. Englewood Cliffs, 
N.J.: Prentice-Hall, Inc., 1963. 
[2] Partial Differential Equations. New York: Holt, Rinehart and Winston, Inc., 
1969. 




B. FUGLEDE 
[1] “A commutativity theorem for normal operators,” Proc. Nat. Acad. Sci.,
U.S.A. 36 (1950), 35-40. 
P. R. GARABEDIAN 
[1] Partial Differential Equations. New York: John Wiley & Sons, Inc., 
1964. 
I. I. GIKHMAN AND A. V. SKOROKHOD 
[1] Introduction to the Theory of Random Processes. Philadelphia: W. B. 
Saunders Company, 1969. 
A. M. GLEASON 
[1] “Measures on the closed subspaces of a Hilbert space,” J. Math. Mech.
6 (1957), 885-893. 
C. GOFFMAN AND G. PEDRICK 
[1] First Course in Functional Analysis. Englewood Cliffs, N.J.: Prentice-Hall,
Inc., 1965. 
R. R. GOLDBERG 
[1] Fourier Transforms. New York: Cambridge University Press, 1962. 
S. GOLDBERG 
[1] Unbounded Linear Operators. New York: McGraw-Hill, Inc., 1966. 
P. HALMOS 
[1] Measure Theory. Princeton, N.J.: D. Van Nostrand Company, Inc., 1950. 
[2] Lectures on Ergodic Theory. New York: Chelsea Publishing Company, 1956. 
[3] Introduction to Hilbert Space. New York: Chelsea Publishing Company, 
1957. 
[4] Finite Dimensional Vector Spaces. Princeton, N.J.: D. Van Nostrand 
Company, Inc., 1958. 
[5] Naive Set Theory. Princeton, N. J.: D. Van Nostrand Company, Inc., 1960. 
G. H. HARDY, J. E. LITTLEWOOD, AND G. POLYA
[1] Inequalities. New York: Cambridge University Press, 1952. 
F. HAUSDORFF 
[1] Mengenlehre. New York: Dover Publications, Inc., 1944. 
G. HELLWIG 
[1] Differential Operators of Mathematical Physics. Reading, Mass.: Addison- 
Wesley Publishing Company, Inc., 1967. 
E. HEWITT
[1] “The role of compactness in analysis,” Amer. Math. Monthly 67 (1960),
499-516. 
E. HEWITT AND K. STROMBERG 
[1] Real and Abstract Analysis. Berlin: Springer-Verlag, 1965. 
D. HILBERT 
[1] Grundzüge einer allgemeinen Theorie der linearen Integralgleichungen.
New York: Chelsea Publishing Company, 1952. 
E. HILLE AND R. S. PHILLIPS
[1] Functional Analysis and Semi-Groups. Providence, R.I.: American Mathe- 
matical Society, 1957. 




J. INDRITZ 
[1] Methods in Analysis. New York: The Macmillan Company, 1963. 
J. M. JAUCH 
[1] “Theory of the scattering operator,” Helv. Phys. Acta 31 (1958), 127-158
and 661-684. 
[2] Foundations of Quantum Mechanics. Reading, Mass.: Addison-Wesley 
Publishing Company, Inc., 1968. 
T. KATO 
[1] “On the existence of solutions of the helium wave equations,” Trans.
Amer. Math. Soc., 70 (1951), 212-218. 
[2] Perturbation Theory for Linear Operators. Berlin: Springer-Verlag, 1966. 
J. KELLEY 
[1] General Topology. Princeton, N.J.: D. Van Nostrand Company, Inc., 1955. 
A. N. KOLMOGOROV 
[1] Foundations of the Theory of Probability. New York: Chelsea Publishing 
Company, 1950. 
A. N. KOLMOGOROV AND S. V. FOMIN 
[1] Elements of the Theory of Functions and Functional Analysis, Vol. I and II.
Albany, N.Y.: Graylock, 1957 and 1961. 
M. A. KRASNOSEL’SKII AND YA. B. RUTICKII 
[1] Convex Functions and Orlicz Spaces. Groningen, Netherlands: P. Noord- 
hoff, N.V., 1961. 
C. LANCZOS 
[1] Linear Differential Operators. Princeton, N.J.: D. Van Nostrand Company, 
Inc., 1961. 
E. B. LEE AND L. MARKUS 
[1] Foundations of Optimal Control Theory. New York: John Wiley & Sons, 
Inc., 1967. 
M. LOEVE 
[1] Probability Theory. Princeton, N.J.: D. Van Nostrand Company, Inc., 1960. 
L. H. LOOMIS
[1] An Introduction to Abstract Harmonic Analysis. Princeton, N.J.: D. Van 
Nostrand Company, Inc., 1953. 
E. R. LORCH
[1] Spectral Theory. New York: Oxford University Press, 1962.
[2] “The spectral theorem,” in Studies in Mathematics, Vol. 1. Math. Assn.
Amer., 1962. 
W. MAAK 
[1] An Introduction to Modern Calculus. New York: Holt, Rinehart and 
Winston, Inc., 1963. 
N. MEYERS AND J. SERRIN 
[1] “H = W,” Proc. Nat. Acad. Sci. U.S.A. 51 (1964), 1055-1056.
S. G. MIKHLIN (editor) 
[1] Linear Equations of Mathematical Physics. New York: Holt, Rinehart and 
Winston, Inc., 1967. 




M. A. NAIMARK 
[1] Normed Rings. Groningen, Netherlands: P. Noordhoff, N.V., 1960. 
Z. NEHARI 
[1] Conformal Mapping. New York: McGraw-Hill, Inc., 1952. 
E. D. NERING 
[1] Linear Algebra and Matrix Theory. New York: John Wiley & Sons, Inc., 
1963. 
J. VON NEUMANN 
[1] The Mathematical Foundations of Quantum Mechanics. Princeton, N. J.: 
Princeton University Press, 1955. 
L. NEUSTADT 
[1] “Minimum effort control,” J. SIAM Control, 1 (1962), 16-31. 
R. E. A. C. PALEY AND N. WIENER 
[1] Fourier Transforms in the Complex Domain. Providence, R.I.: American
Mathematical Society, 1934. 
W. A. PORTER 
[1] Modern Foundations of Systems Engineering. New York: The Macmillan 
Company, 1966. 
Yu. V. PROKHOROV AND M. FISZ
[1] “A characterization of normal distributions in Hilbert space,” Theory
Prob. and Its Appl. 2 (1957), 468-469. 
R. PROSSER AND W. ROOT 
[1] “The ε-entropy and ε-capacity of certain time-invariant channels,” J.
Math. Anal. Appl. 21 (1968), 233-241. 
F. RIESZ AND B. Sz.-NAGY 
[1] Functional Analysis. New York: Frederick Ungar Publishing Co., 1955. 
H. L. ROYDEN 
[1] Real Analysis. New York: The Macmillan Company, 1963. 
W. RUDIN 
[1] Principles of Mathematical Analysis, 2nd edition. New York: McGraw-Hill,
Inc., 1964. 
L. SCHWARTZ 
[1] Théorie des Distributions, Vol. I and II. Paris: Hermann & Cie,
1951. 
G. R. SELL AND H. WEINBERGER 
[1] “Periodic behavior in a food chain,” (to appear).
G. F. SIMMONS 
[1] Introduction to Topology and Modern Analysis. New York: McGraw-Hill, 
Inc., 1963. 
S. L. SOBOLEV 
[1] Applications of Functional Analysis in Mathematical Physics. Providence, 
R.I.: American Mathematical Society, 1963.
M. H. STONE 
[1] Linear Transformations in Hilbert Space. Providence, R.I.: American
Mathematical Society, 1932. 




A. E. TAYLOR 
[1] Advanced Calculus. Waltham, Mass.: Blaisdell Publishing Company, 1955. 
[2] Introduction to Functional Analysis. New York: John Wiley & Sons, Inc., 
1958. 
J. G. TRUXAL 
[1] Automatic Feedback Control System Synthesis. New York: McGraw- 
Hill, Inc., 1955. 
A. WILANSKY 
[1] Functional Analysis. Waltham, Mass.: Blaisdell Publishing Company, 1964. 
R. L. WILDER 
[1] Introduction to the Foundations of Mathematics. New York: John Wiley &
Sons, Inc., 1952. 
D. YOULA, L. CASTRIOTA, AND H. CARLIN 
[1] “Bounded real scattering matrices and foundations of linear passive net-
work theory,” IRE Trans. on Circuit Theory (March 1959).
A. C. ZAANEN 
[1] Linear Analysis. Groningen, Netherlands: P. Noordhoff, N.V., 1953. 
A. ZYGMUND 
[1] Trigonometric Series, Vol. I and II. New York: Chelsea Publishing 
Company, 1952. 


Index of Symbols 


$\bar{A}$ closure of A, 104

$A \cup B$ union, 14

$A \cap B$ intersection, 15

$A - B$ difference of sets, 15

$A \,\Delta\, B$ symmetric difference, 15-16

$A \times B$ Cartesian product, 17

$A \perp B$ orthogonal sets, 283

a.e. almost everywhere, 569

$AP\sigma(T)$ approximate point spectrum, 413

$B_r[x_0]$ closed ball, 77

$B_r(x_0)$ open ball, 77

$BC(I)$ bounded continuous functions, 52, 115-116

$BC(X,Y)$ bounded-continuous functions, 119, 219

$Blt[X,Y]$ bounded linear transformations, 247

$BV(I)$ functions of bounded variation, 220, 233

$C$ complex numbers, 18

$C^n$ complex n-space, 48, 106, 119, 218, 278

Card(X) cardinality, 552-555

$\chi_A$ characteristic function, 582

Cov(X,Y) covariance, 602

$C\sigma(T)$ continuous spectrum, 412

$C[0,T]$, $C(-\infty,\infty)$, $C(T,R)$ continuous functions, 12, 51, 53-54,
115, 219, 278-279

$C_0$, $C_0(R,R)$ continuous functions with compact support, 564

$C^{\alpha}(I)$ Hölder-continuous functions, 68, 154, 221

$C^n(\Omega)$ functions with n-continuous derivatives, 25, 221, 280

$C^{\infty}(\Omega)$ infinitely-differentiable functions, 25, 221

$C_0^{\infty}(\Omega)$ $C^{\infty}$-functions with compact support, 221, 591

$c_0$ sequence space, 222

$c$ sequence space, 222

$\mathscr{D}(f)$ domain of the function f, 23


diam(A) diameter, 46 

d(x,A) distance, 46 

$d_p(x,y)$ Lebesgue space metric, 51-52

$D^{\alpha}$ partial differential operator, 221, 282, 364

$\oplus$ direct sum, 196-200, 342-344

ess. sup. essential supremum, 590

$e^A$ exponential, 227

$E^{\mathscr{B}}$ conditional expectation operator, 605

$E[X]$ expectation, 602

$\xrightarrow{s}$ strong convergence, 251-252

$F$ scalar field, 161

$\emptyset$ empty set, 13


H"(Q) Sobolev space, 281 
H,"(Q) Sobolev space, 281 
H™?(Q) Sobolev space, 282 
H,"?(Q) Sobolev space, 282 


H, analytic function space, 224 
inf infimum, 18-19 

$\Rightarrow$ implies, 7

$\Leftrightarrow$ if and only if, 7-8

$\int$ integral, 560

$\underline{\int}$ lower integral, 560

$\overline{\int}$ upper integral, 560

$\int_A f\,dt$ integral, 583

$J_{\varepsilon}$ mollifier operator, 243

$\log(I - A)$ logarithm, 227

$L(P,f)$ lower sum, 559

$L^T$ transpose of L, 210

$l_p[0,\infty)$, $l_p(-\infty,\infty)$, $1 \le p \le \infty$, sequence space, 31, 49, 106-107,
114-115, 218-219, 280-281

$Lt[X,Y]$ linear transformation space, 170, 190

LC Lipschitz-continuous functions, 68


LC,, Lipschitz-continuous functions, 
68 

LC,,» conjugate of LC,,, 68-69 

L,L(—o,0) Lebesgue integrable 


functions, 572 




$L_p(I)$, $1 \le p \le \infty$ Lebesgue space,
24, 51-52, 107, 115, 219-220, 279,
589


L.°(—o,0) Lebesgue space, 37 

L.(—io, ico) Lebesgue space, 53 

L2o(—ic, io.) Lebesgue space, 52— 
53 


$L_2(\Omega,\mathscr{F},P)$ space of random variables, 281

$M \vee N$ lattice operation, 300

$M \wedge N$ lattice operation, 300

$m(A)$ measure, 582

$M^{\perp}$ orthogonal complement, 292

M*” matrix space, 224 

N natural numbers, 18 

NBV (I) normalized functions of 
bounded variation, 223, 233 

$\mathscr{N}(T)$ null space, 166

< ordering, 556 

$P$ differentiation operator or momentum operator, 362-363, 496

$P[0,T]$ polynomial space, 106

$P_j$ momentum operator, 364, 496-497

$P\sigma(T)$ point spectrum, 412

Q rational numbers, 18 

$Q$ position operator, 362-363, 495

$Q_j$ position operator, 364, 495

q(x,y) sesquilinear functional, 277 

Q(x) quadratic form, 277 

$R$ real numbers, 12, 18

$R^2$ real plane, 4, 47-48, 97

$R^n$ real n-space, 24, 48, 98, 106,
119, 218

$r_{\sigma}(T)$ spectral radius, 434

$R\sigma(T)$ residual spectrum, 412

$\mathscr{R}(f)$ range of the function f, 23,
166

$R_{\lambda}$ inverse (resolvent) operator, 434

R,A({a,b]) Riemann integrable func- 
tions, 560 

$\rho(T)$ resolvent set, 412

$\rho(X,Y)$ correlation coefficient, 602

sup supremum, 18

$S_r[x_0]$ sphere, 77

$\sigma(T)$ spectrum, 412

$\sigma(X)$ standard deviation, 602

$\sigma^2(X)$ variance, 602

$T^*$ adjoint operator, 352

$T'$ conjugate operator, 270-271

$T \le S$, $T < S$ ordering for self-adjoint operators, 370

$\mathscr{T}$ topology, 87


U(P,f) upper sum, 559 

V(Q)_ potential operator, 496 
V(A)_ span of A, 176-177 
V(f) total variation, 220 


V(f;a,t) total variation, 223 

W(T) numerical range, 434 

W"(Q) functions with weak deriva- 
tives, 282 

X' algebraic conjugate, 208 

X’ conjugate space, 270 

$X_1 \times X_2 \times \cdots \times X_n$ Cartesian product, 17, 60

$Z$ integers, 18

$(x, x_n)$ Fourier coefficient, 307

$(x,y)$ inner product, 272; ordered
pair, 17

(-,l),(x,/) linear functional, 205 

|{4||n,» norm on Cy”, 222, 281-282 

$x \perp y$ orthogonal vectors, 283

xRy relation, 20, 556 

x~y relation, 21 


Index 


Absolute continuity, 594-598 

Absolute convergence, 225 

Addition of vectors, 161, 217 

Adjoint operator, 332, 352-367, 447, 527, 
529-532 

Algebraic complement, 199-200, 205 

Almost everywhere, 569 

Almost periodic functions, 257, 320 

Almost sure, 599 

Amplification (see Gain) 

Annihilation operator, 532, 546 

Anticausal operator, 41, 356—357 


Approximation, 106, 110, 234, 253, 283, 
285-290, 302-303, 310-312, 346- 
347, 359, 381 


(See also Compactness; Separable space; 
Totally bounded set) 
Arzela-Ascoli Theorem, 148-151, 155-156 
Axiom of Choice, 556 
Axiomatic method, 6-7 


Baire null space, 49-50 

Baire Theorem, 120 

Balls, 77 

Banach space, 217, 267 

Bergman kernel, 351 

Bernstein polynomials, 109-110 

Bernstein Theorem, 552-553 

Bessel Inequality, 307-308, 315, 322 

Biharmonic operator, 511, 517, 522 

Biorthonormal sequence, 434-435 

Bohr almost periodic, 320 

Bolzano-Weierstrass property, 144-146, 155

Borel field, 588 

Boundary, 77, 110 

Boundary conditions, 488-492, 498-505,
516-518

Bounded set, 18, 46, 114, 134-135

Bounded linear functional (see Continuous
linear functional)

Bounded variation, 220

normalized functions of, 223
Brownian motion, 505


Cantor set, 103, 111, 583-585 

Cardinality, 552-555 

Cartesian decomposition, 376-377, 464 

Cartesian product, 17 

Cauchy sequence, 112-114, 119-120 

Cauchy Test, 225 

Causal operator, 40-41, 133, 165, 169-170, 
203, 256, 337, 356-357, 372-373, 
441 

Cayley transform, 375, 533-539 

Chain, 556 

Characteristic function, 582 

Channel capacity, 480-481 


Closable operator (see Linear transforma- 
tion, closable) 
Closed, balls, 77, 102 
local neighborhoods, 78 
mapping, 108 
sets, 101-112, 116 
Closed Graph Theorem, 256 
Closed loop feedback system, 131-133 
Closed Set Theorem, 105 
Closed operator (see Linear transformation, 
closed ) 
Closure, 104-105 
C*-function, 25 
Co-dimension, 199-200, 205 
Commutation, integration and differentiation, 595
integration and lim, 578, 579 
integration and summation, 577 
limit and continuous function, 74 
linear operators, 369, 375, 408, 466, 467 
order of integration, 598 
time-invariant operators, 199 
Compact closure, 143, 155 
Compact linear subspace in, 505-506 
Compact operator, 379-388, 435, 448-470,
476-483, 486-487, 500-502, 508-
510, 519-520
Compact-normal resolvent, 487—489 
Compact-self-adjoint resolvent, 487-489, 
500-502, 508-509, 520, 523, 533 
Compact set, 142 
Compact space, 142-157 
Compact support, 221 
Compactness, in C”, 147 
Heine-Borel compact, 145—146, 155 
locally compact, 154 
in l,, 152 
in L,(Q), 152 
in normed linear space, 269-270 
in R, 146 
in R*, 147 
sequentially compact, 142-146, 155 
Compactness Theorem, 146 
Complement (see Algebraic complement; 
Set theory) 
Completely continuous operator (see Com- 
pact operator) 
Completeness, 112-120 
Completion, 120-125, 263, 350 
Conditional expectation, 294, 604-607 
Conditionally compact (see Compact 
closure ) 
Conjugate operator, 270-271 
Conjugate space, algebraic, 208-212 
normed, 270 
Conjugation mapping, 536 
Connected space, 95, 119 




Continuity and convergence, 74-76, 82-96 
Continuous function, 61-69, 74-76, 79-80, 
84-85, 87-88, 92, 95, 101, 108, 115, 
120, 146, 148 
extension, 111, 122 
maxima and minima of, 148 
piecewise, 71 
Continuous linear functional, 344-352 
Continuous time, 38 
Contractible, 95-96 
Contraction mapping, 125-134 
Contraction Mapping Theorem, 126-127 
Contradiction, proof by, 7 
Convex function, 218 
Convex set, 182 
Convolution operator, 40, 66, 236, 428 
(See also Time-invariant operator) 
Coordinate system (see Hamel basis; Ortho- 
normal basis) 
Correlation coefficient, 602 
Correspondence, one-to-one, 31 
Coulomb perturbation, 466 
Coulomb potential, 496, 541 
Countable (countably infinite), 13, 554 
Covariance, 602-604 
Covariance function, 473—474 
Covariance matrix, 475 
Covariance operator, 378 
Covering, 145 
Creation operator, 532, 546 


Daniell approach, 558-559, 566, 586-589 
Decreasing sequence of sets, 118 
Deficiency indices, 535 
Deficiency subspaces, 535 
Deformed plate, 517 
Delay (see Shift operator) 
De Morgan’s Laws, 17 
Dense set, 106 
Density function, 601 
Density operator, 390, 392, 435, 466, 542- 
543 
Derivative, strong and weak, 282 
Deviation, 541 
Diagonal matrix representation, 404, 461 
Diameter, 46 
Difference operators, 257, 415 
(See also Shift operator) 
Differential operators, ordinary, 25, 66-67, 


141, 167-168, 170-171, 228-229, 
242, 246-247, 259, 262, 348-349, 
364, 376, 423-427, 430, 470, 488- 


491, 498-505, 536, 543 
partial, 221, 242-243, 490-492, 510-516, 
536, 540 
Differentiation, 593-596 
Dimension, 184-187, 200, 269, 319, 340 
Dini’s Theorem, 565-566 
Dirac function, 69, 122-124 


Direct sum, of inner product spaces, 342- 
344 
of linear spaces, 196-200 
Dirichlet problem, classical, 516-522 
classical solution of, 518 
generalized, 518 
weak solution of, 518 
Disconnected space, 95, 107 
Discrete time, 38 
Dispersion, 392 
Distribution function, 600, 602 
Distribution theory, 222 
Dollar cost averaging, 182 
Dominated Convergence Theorem, 579 
Dynamical equations, 391, 540 


Eigenfunction (see Eigenvector) 
Eigenmanifold, 400, 411, 435, 449, 453 
Eigenvalue, 400, 402, 406-409, 411, 435,
453, 455, 457-458, 461-468, 493-
495
Eigenvalue-eigenvector problem, 399-401,
520
Eigenvalue-eigenvector representation, 460-
461
Eigenvector, 400, 408-409, 411, 453, 493-
495
Electrostatic field, 517
Elliptic operator, 510-516, 519-522
ε-net, 136
Equi-continuous, 149-151, 155
Equivalence class, 21
Equivalence relation, 20-22, 557
Error criteria, 47, 89
Essential supremum, 51, 220, 590
Essentially self-adjoint operator, 538-539
Euclidean distance, 4, 215
Event, 599
Event with probability zero, 599
Eventually in, 80
Existence and uniqueness theorem, ordinary
differential equations, 129-131
Volterra integral equations, 134
Expectation, 599, 602-603
Expected value, 391-392, 458
Extended real numbers, 18
Extension, continuous functions, 111, 122
linear transformations, 170, 241, 250
self-adjoint, 534
Exterior, 110


Fatou Theorem, 578, 580 

Finite-dimensional linear space, 185-187, 
208, 264-270, 387, 402-409, 414-415 

Fixed points, 126 

Formally self-adjoint (see Symmetric oper- 
ator) 

Fourier coefficients, 307, 320, 327 

Fourier-Plancherel transform, 361, 367 

(See also Fourier transform ) 


Fourier series (classical), multi-variable, 
326-327 
one variable, 322-325, 328 
Fourier series expansion, 307, 327 
Fourier Series Theorem, 307, 311-312, 315, 
332-333 
Fourier transform, applications, 133, 203- 
204, 257, 335-339, 388, 513-514 
basic theory, 334-335, 360-366 
on R’, 360-363 
on R", 363-364 
Fredholm alternatives, 462—463, 465, 467- 
468 
Fubini Theorem, 598 
Function(s), 22-38 
absolutely continuous, 594—598 
Bohr almost periodic, 320 
bounded variation, 220 
characteristic, 582 
class C*, 25 
closed, 108 
compact support, 221 
composition of, 23-24 
constant, 23 
continuous (see Continuous function) 
contraction, 125-134 
convex, 218 
covariance, 473-474
density, 601 
Dirac, 69, 122—124 
distribution, 600, 602 
domain of, 23 
equi-continuous, 149-151, 155 
essentially bounded, 220, 590 
even, 198 
extension, 23 
graph, 24 
Green’s, 488-493, 500-505 
Holder continuous, 68-69, 154, 221 
identity, 23 
image, 23 
inverse, 29-38 
inverse set function, 27, 31 
invertible, 29-38 
joint distribution, 602 
Kronecker (delta), 25 
Lebesgue integrable, 572 
left inverse, 32—33 
linear (see Linear transformation) 
Lipschitz continuous, 68-69, 111 
measurable, 582-583, 589 
modulus of continuity, 62 
monotone, 34 
nonlinear, 165 


odd, 198 
one-to-one, 23, 31
onto, 23 

open, 88, 91 
pre-image, 23 
range, 23




representation, 23 

restriction, 23 

right inverse, 33 

simple, 587 

step, 575, 586 

strictly decreasing, 34 

strictly increasing, 34 

uniformly continuous, 62, 117, 120, 155, 

241 

uniformly equi-continuous, 149, 155 

Fundamental Theorem of Calculus, 594 


Gain, 133, 240, 249 

Garding’s Inequality, 505—513 

Geometric analysis (see Spectral analysis; 
Structure, geometric) 

Gram-Schmidt orthogonalization process, 
312-313 

Graph, 24, 531 

Greatest lower bound (see Infimum) 

Green’s formula, 497 

Green’s functions, 488—493, 500-505 

Gronwall inequality, 141 


Half space, 206 
Hamel basis, 183-187, 190, 200, 208, 218, 
264-265, 315, 319, 557 
Hamiltonian equations, 539 
Hankel transform, 366 
Harmonic oscillator, 543-546 
Heat equation, 523-524 
Heine-Borel compact, 145-146, 155 
Heisenberg commutation property, 543 
Heisenberg Uncertainty Theorem, 541-543 
Helium atom, 541 
Hermite, functions, 137, 327, 544—546 
generating function for polynomials, 545 
polynomials, 137, 192, 327-328, 544-545 
Hilbert cube, 103, 108, 112, 116 
Hilbert dimension, 319, 340 
Hilbert-Schmidt norm, 387 
Hilbert-Schmidt operator, 387 
Hilbert space, 272—282, 296-298, 301-304, 
307-312, 344-346 
proper functional, 350 
reproducing kernel, 350 
Holder, coefficient, 68, 221 
continuity, 68-69, 154, 221 
space, 221, 223, 510 
Holder Inequality, 548-551, 591 
converse of, 591 
Homeomorphic spaces, 93-94, 97-101 
Homeomorphism, 93, 153 
homotopically equivalent, 101 
uniform, 117 
Hydrogen atom, 541 
Hyperplane, 206-207 


Implication, 7 
Implicit Function Theorem, 133




Impossible event, 599 
Impulse response, 24 
Infimum, 18 
Infinite, countably, 13, 554 
uncountably, 13, 554 
Infinite series,
absolute convergence, 225-229
convergence, 224-229, 257, 308-309,
315-319
divergence, 225
of operators,
strong convergence, 251, 257
uniform convergence, 250-251, 256
unconditional convergence, 226-227
Infinitesimal generator, 378
Inner product, 272
Inner product space, 273, 275, 294-295, 332
completion of, 350
(See also Hilbert space)
Integral operator, continuous kernel, 379,
436, 463, 465, 473-474, 488
convolution, 26-27, 34-37, 40-41, 167-
168, 191, 203-204, 236, 242, 411,
414, 428-429, 478-481
with $L_2$-kernel, 65-66, 124, 140, 236, 247-
248, 354-355, 369, 382, 385, 392,
407, 465, 467, 482, 488
other types, 234-236, 376, 384, 386, 414,
437, 482, 536
Volterra type, 26-27, 34-37, 41, 63-66,
167-168, 171, 203-204, 238, 355,
413-414, 428-429, 436, 478-480, 482
Integral test, 228
Interchange (see Commutation)
Interior, 77, 110
Inverse, 29
continuity, 66
defined on the range, 31
left, 32, 38
right, 33, 38
Inverse Mapping Theorem, 134
Inverse operator, 332
Inverse image mapping, 27
Inverse set function, 27
Isometric isomorphism, 258, 332, 350, 358,
387
Isometric spaces, 94, 117, 121
Isometrically isomorphic, 258
Isometry, 94, 96
Isomorphic, 173, 185, 199-200
Isomorphism, 173-176, 181, 199-200
topological, 257
Isoperimetric Theorem, 329-330
Iteration, 126


Joint distribution function, 602 


Karhunen-Loeve expansion, 473-476, 505 
Kronecker (delta) functions, 25 


Laguerre, functions, 327, 330-331, 545 
polynomials, 327, 330, 385, 504 

Laplacian operator, 490, 492, 497, 511-512, 

516-517 

Lattice, 300 

Lax-Milgram Theorem, 346, 350 

Least upper bound (see Supremum) 

Lebesgue integrable function, 572 

Lebesgue integral, alternate definitions, 586-589
compared with Riemann integral, 572, 
581, 585 


definition, 572, 581-583 
properties, 573-583 
Lebesgue measure, 582-585, 588 
Lebesgue spaces, completeness of, 589-591
definition, 589 
properties of, 589-593 
Legendre polynomials, 191-192, 321, 329, 
504 
Levi Theorem, 577, 581 
Linear combination, 176 
Linear dependence, 176-183 
Linear functional, 204-208, 344-351 
continuous, 344-352 
Linear independence, 176-183, 306 
Linear mapping (see Linear transformation) 
Linear operator (see Linear transformation) 
Linear space, 161-165, 241 
complex, 162 
real, 162 
sum and direct sum, 197 
Linear subspace(s), closed, generated by a 
set, 311 
compact in, 505, 506 
compatibility, 305 
definition, 162 
dense in L,, 591 
disjointness of, 198-201 
finite dimensional, 267 
invariant, 356-357 
of normed linear space, 229-234, 267 
reducing, 357 
shift invariant, 179, 298 
spanning set, 176-179, 182-183 
sum of, 196-200, 340-344 
total, 212 
Linear transformation(s), adjoint (see Ad- 
joint operator) 
bounded, 239-240, 352-354 
bounded below, 244 
Cartesian decomposition, 376-377, 464 
closable, 529, 532 
closed, 241-242, 529-533 
closure of, 532 
compact (see Compact operator) 
continuous, 234-257, 268, 270, 380, 444 
continuous inverse, 243-247, 257-264, 
268 


definition, 165—171 
densely defined, 486 
dimension of range and null space, 186 
discontinuous, 237, 241, 486 
equivalence of, 192-195 
extension, 170, 241 
with finite dimensional range, 401-409,
414-415, 430

invariant subspace, 356-357 
inverse, 171-176, 243-247 
invertible, 171-172 
isomorphically equivalent, 193-195 
left or right inverse, 173 
matrix representation, 188-192 
norm of, 248-249, 252-253, 255, 465 
normal (see Normal operator) 
null space of (see Null space) 
one-to-one, 171, 211-212, 445 
polar decomposition of, 377, 464 
range of (see Range) 
reduces, 357 
self-adjoint (see Self-adjoint operator ) 
similar, 193, 195 
transpose, 208-212 

Lipschitz condition, 594 

Lipschitz continuity, 68-69, 111 

Local neighborhoods, 77-82, 85 

Logical principles, direct proof, 8
if and only if statements, 7-8
mathematical induction, 8-9
principle of contradiction, proof by contradiction, 8

Low-pass filter, 202 

Lower bound, 18 

Lower integral, 560 

Lower sum, 559 


Matched filter, 471—473 
Mathematical induction, 8—9 
Mathematical modeling, 5—7, 168 
Matrix multiplication, 190 
Matrix operator, 31-34, 62-63, 66, 91, 134, 
188-192, 203, 209-212, 252-253, 
354, 368, 376, 378, 383, 387, 407- 
409 
Maximal element, 556 
Maximal orthonormal set, 306, 441 
Maximum, 18 
Mean Ergodic Theorem, 291 
Measurable function, 582-583, 589 
Measurable set, 582, 587-588 
Measure, absolutely continuous, 596 
definition, 586-587 
finite, 596 
positive, 586 
probability, 599 
σ-finite, 596
Measure space, 586-589




Memoryless operator, 41
Metric, 45, 216-218 
equivalent, 84, 87, 95 
pseudo, 47 
Metric space, 45-156, 216-218 
complete, 113-120, 143-144, 155, 217 
completion, 120-125, 263 
connected, 95, 107 
disconnected, 95 
product space, 58-61, 119, 147 
subspace, 56-58 
Minimum, 18 
Minimum point, 290 
Minkowski Inequality, 548-551 
Mixture state, 392 
Modulus of continuity, 62 
Mollifier operator, 243, 506-508 
Momentum operator, 496-498, 531, 540 
Monotone Convergence Theorem, 577-578 
Multiplicity (of eigenvalue), 411 
Multiplier operators, 24, 64, 67, 168, 196-197,
242, 245-246, 259, 355-356,
358, 370, 376, 382-383, 430, 495-498


Neighborhood, 81-82 
(See also Local neighborhood) 
Neighborhood system, 81-82 
Neumann series (Neumann expansion), 253-255,
257, 432
Noise, 471-473, 480-481 
Nonlinear filter, 127-129 
Nonlinear transformation, 165 
Nonnormal compact operators, 476-483 
(See also Compact operator) 
Norm, 215-218, 274, 370-371 
equivalent, 261 
Normal operator, 367-368, 373-379, 398, 
448, 453-461, 486-487, 493 
Normal space, 108 
Normed conjugate space (see Conjugate 
space ) 
Normed linear space, 216, 218-224, 264- 
270 
completion, 263 
uniformly convex, 278 
Null set, 566-569, 582 
Null space, 166, 171, 186-187, 201-202, 
205, 211-212, 241, 301-304, 357- 
358, 365, 400, 402 
Number systems, complex numbers, C, 18 
extended real numbers, 18 
integers, Z, 18 
natural numbers, N, 18 
rational numbers, Q, 18 
real numbers, R, 18 
Numerical range, 434 


Observable, 390, 458, 540-541 




Open, balls, 77 
covering, 145, 155 
local neighborhoods, 77—79 
mapping, 88, 91 
sets, 86-96 
subcovering, 145, 155 
Open Mapping Theorem, 256 
Operational calculus, 468-470 
Operator norm topology, 250, 256-257 
Operator on a Hilbert space, 486 
Operator topologies, 247-257 
Optimal control, 151, 580 
Or, Inclusive and Exclusive, 13 
Ordered n-tuple, 17 
Ordered triplet, 17 
Ordered pair, 17 
Orlicz space, 223-224 
Orthogonal complement, 292-305 
Orthogonal projection, 300-305, 310-312, 
340, 371-372, 388-390, 392, 397, 
414, 441, 533
Orthogonal set, 305 
Orthogonal Structure Theorem, 341-342 
Orthogonality, 231, 283-292 
Orthonormal basis, 306-312, 314-315, 319, 
322—331, 435, 487, 494, 502-503 
Orthonormal set, 305, 441, 477 


Parallelogram Law, 275-276 
Parseval Equality, 307 
Partial ordering, 556 
Partial sums, 224 
Partition, 19, 559 
Partitions (see Equivalence relations) 
Passive mapping, 372-373 
Pauli spin matrices, 195 
Perturbation theory, 436-437 
Piecewise continuity, 71 
Point, of accumulation, 103 
of adherence, 103-105 
boundary, 110 
exterior, 110 
interior, 110 
Pointwise compact, 149-151, 155 
Pointwise convergence (see Strong converg- 
ence of operators) 
Polar decomposition, 377, 464 
Position operator, 495, 497-498, 530-531, 
540 
Positive operator, 370, 378 
Positive square root, 377 
Potential operator, 496-498, 531, 540-541 
Principle of superposition, 166-167, 237- 
238, 241 
Probability, (Probability measure), 599 
Probability space, 599 
Process of Abstraction (see Axiomatic
method)
Product spaces, metric spaces, 58, 147
Projection, 201-204
(See also Orthogonal projection)
Projection Theorem, 297-298, 301-302 
Proper functional Hilbert space, 350-351 
Pseudometric, 47 
Pseudometric space, 47, 119, 125 
Pseudonorm, 217 
Pure state, 390, 392 
Pythagorean Theorem, 283 


Quadratic form, 277 
Quantum mechanics, 195, 379, 388-392, 
495-496, 539-546 


R-integrable (see Riemann integrable) 
Radius of convergence, 228 
Radon-Nikodym derivative, 597 
Radon-Nikodym Theorem, 596-598, 605 
Random process, 473-475 
Random variable, 404-407, 599-602
Range (Range space), 166, 186-187, 195,
201-204, 211-212, 301-304, 357-
358, 365, 397-402, 406-407
Rayleigh-Ritz method, 461-462
Reduces, 357
Reduction, 357
Regular space, 108
Relation, 20-22, 556
Relatively compact (see Compact closure) 
Relatively dense, 320 
Rellich’s Theorem, 506-508 
Reproducing kernel, 350-351 
Resolution of the identity, 397-403, 439- 
441, 466-467 
Resolvent operator, 434 
Resolvent set, 412, 432-433 
Riemann integrable, 560, 585 
Riemann integral, compare with Lebesgue 
integral, 564, 581, 585 
definition, 559-560 
lower, 559-560 
properties of, 561-564, 585 
upper, 559-560 
Riesz Representation Theorem, 345-346,
350
Riesz Theorem, 231, 285-287


Sample point, 599
Sample space, 599
Sampling theory, 336-337
Scalar field, 161
Scalar multiplication, 161, 217
Scattering operators, 379
Schrödinger operator, 540, 543
Schwarz Inequality, 273-274, 276, 548-551
Second moment, 602
Self-adjoint extension, 534-536, 538-539
Self-adjoint operator, bounded, 367-373,
375-379, 391-392, 398, 402-409,
414, 435, 447, 453-470, 488-492,
500-502
negative part, 464
norm of, 370-371
positive, 370, 378
positive part, 464
strictly positive, 370, 378
unbounded, 391, 493, 527-534
Separable Hilbert space, 313-314, 339
Separable space, 106-109, 136, 155
Separation of variables, 520-527
Sequence(s), convergence almost
everywhere, 569-572
convergence of, 69-76, 88, 92, 112-113,
119
decreasing, 577
decreasing sequence of subsets, 118
eventually in, 80, 88
increasing, 577
limit, 70
of operators, strong convergence, 251,
257
uniform convergence, 250-251, 256
uniform convergence on compact sets,
156
Sequence space, 31, 49, 106-107, 114-115,
218-219, 280-281
Sequentially compact, 142-146, 155
Series, absolutely convergent, 225
convergent and divergent, 225
partial sums of, 224
unconditional convergence, 226
Sesquilinear functional, 277, 346, 374-375
Sets, ball, 77, 102
boundary, 77, 110
bounded, 18, 46, 114, 134-136
Cantor, 103, 111, 583-585
closed, 101-112, 116, 290
compact, 142
connected, 95, 119
contractible, 95-96
convex, 182, 290
countable, 554
decreasing sequence of, 118
dense, 106
diameter, 46
disconnected, 95, 107
empty, 13
ε-net, 136
exterior, 110
Hilbert cube, 103, 108, 112, 116
interior, 77, 110
local neighborhood, 77-82, 85
lower bound, 18
maximal element, 556
maximal orthonormal, 306, 441
maximum, 18
measurable, 582, 587-588
measure zero, 569, 582 
minimum, 18 
neighborhood, 81-82 
null, 566-569, 582 
open, 86-96 
operations, 14-17 
orthogonal, 305 
orthonormal, 305 
spanning, 176-179, 182-183 
sphere, 77, 102 
supremum, 18 
totally bounded, 136-141, 143-144, 155 
uncountable, 554 
underlying, 45 
upper bound, 18, 556 
Set of measure zero, 569, 582 
Set operations, 14-17 
Set theory, 12-22 
Shift operators, 38-40, 179-180, 194, 253, 
257, 263, 359-360, 415-423, 429- 
431, 449, 466 
σ-field, 586
Signal-noise ratio, 471-473 
Simple function, 587 
Sobolev space, 281-282, 506, 510, 514-515, 
518-520 
Space(s), Baire null space, 49-50 
Banach, 217, 267 
compact, 142-157 
complete, 112-113 
conjugate, 208-212, 270 
finite dimensional (see Finite-dimensional
linear space)
function, 163
Hilbert (see Hilbert space)
Hölder, 221, 223, 510
homeomorphic, 93-94, 97-101
inner product (see Inner product space)
linear, 161-165, 241
locally compact, 154
measure, 586-589
metric, 45-156
normed linear, 216-224, 264-270
product, 58-61, 119, 147
sample, 599
separable, 106-109, 136, 155, 313-314,
339
Sobolev (see Sobolev space)
Span, 176-179, 182-183
Spectral analysis, 396-409, 459-470
Spectral decomposition, 459
Spectral family, 466
Spectral Mapping Theorem, 433, 469-470
Spectral radius, 434
Spectral Theorem, applications of, 404-
409, 470-476
compact case, 459-461
finite dimensional, 398, 403-404
unbounded case, 486-488
Spectral theory, 396 
Spectrum, approximate point spectrum, 413 
continuous spectrum, 412 
definition of, 412 
examples of, 414-431, 445-446 
point spectrum, 412 
properties of, 414-415, 431-437, 449-458, 
466, 488 
residual spectrum, 412 
Sphere, 77, 102 
Square Root Algorithm, 72-73
Square root of an operator, 377, 470 
Stability, 35 
Standard deviation, 602 
State, 389-390, 392 
Stationary membrane, 516 
Step function, 575, 586 
Stochastic independence, 182, 603-604 
Stochastic process, 607 
Strong convergence of operators, 251 
Strong derivative, 282 
Strongly elliptic operator, 512, 519 
Structure, algebraic, 4, 160-210, 214, 257
combined topological and algebraic, 4-5,
213-393
geometric, 3, 5, 213-393 
set-theoretic, 4, 11-42 
topological, 4, 44-77, 79-96, 214, 216, 
oe Pt Bs) 
inherited, 57 
Sturm-Liouville operator, 498-505 
singular, 504 
Subsequence, 142 
Subspace, metric space, 57 
(See also Linear subspace) 
Sub-σ-field, 604-605
Sum, of linear subspaces, 196-200, 297,
340-344
topological, 340-344
Sup metric, 51
Sup norm, 219-220
Superposition (see Principle of superposition)
Supremum, 18
Sure event, 599
Symmetric operator, essentially self-adjoint,
538-539
examples of, 495-505, 511
properties of, 493-494, 530, 534-536, 539
self-adjoint extension of, 534-536, 538-
539
semibounded, 537-538
Systems identification problem, 478-480


Tchebychev polynomials, 504
Tietze Extension Theorem, 111
Time-invariant operator, 38-40, 164, 191,
199-200, 256, 288, 360, 373, 388,
428, 478-480
Time-varying operator, 40, 349
Topological isomorphism, 257-264, 268,
387
Topologically isomorphic, 257-264, 268
Topological property (Topological invari-
ant), 94-96, 108, 117, 146-147
Topological sum, 340-344
Topologies, coarser, 89
commensurable, 89-92
incommensurable, 89-92
weaker, 89
Topology, 44, 87, 94-95
Total ordering, 556
Total variation, 220
Totally bounded set, 136-141, 143-144, 155
Trace, 389, 392, 467
Transfer function representation, 404, 461
(See also Multiplier operators)
Translation (see Shift operators)
Triangle inequality, 45, 215


Unbounded linear operators, 239-240 
domain of, 256 
Unconditional convergence, 226, 309 
Uncountable (uncountably infinite), 13, 554 
Uncountable orthonormal basis, 314-315
Uniform Boundedness Principle, 255-256
Uniform continuity, 62, 117, 120, 155, 241
Uniform equi-continuous, 149, 155
Uniform homeomorphism, 117-119
Uniform topology (see Operator norm
topology)
Unitary equivalence, 331-340
Unitary operator, 331-340, 358, 375, 377-
378, 391, 460-461, 467, 534
Upper bound, 18, 556
Upper integral, 560
Upper sum, 559


Variance, 602
Variation, 220
Vector space (see Linear space)
Violin operator, 504-505


Watson transform, 366-367
Wave equation, 524-525
Weak derivative, 282
Weierstrass M-test, 228
Weierstrass Theorem, 110
Weighted sum of projections, 442-449, 459-
461, 464, 468-470, 482, 486-487,
528-529, 533, 539
Weighting function, 24
Wronskian, 501


Yes-no experiment, 388-389 


z-transform, 175, 180, 194, 333-334, 339 


Zorn’s Lemma, 556-557 


